

NAVAL RESEARCH LOGISTICS QUARTERLY

DECEMBER 1970
VOL. 17, NO. 4




OFFICE OF NAVAL RESEARCH 

NAVSO P-1278 






NAVAL RESEARCH LOGISTICS QUARTERLY 



EDITORS 



H. E. Eccles 
Rear Admiral, USN (Retired) 

F. D. Rigby 
Texas Technological College 



O. Morgenstern 

New York University 

D. M. Gilford 
U.S. Office of Education 



S. M. Selig 

Managing Editor 

Office of Naval Research 

Arlington, Va. 22217 



ASSOCIATE EDITORS 



R. Bellman, RAND Corporation 

J. C. Busby, Jr., Captain, SC, USN (Retired) 

W. W. Cooper, Carnegie Mellon University 

J. G. Dean, Captain, SC, USN 

G. Dyer, Vice Admiral, USN (Retired) 

P. L. Folsom, Captain, USN (Retired) 

M. A. Geisler, RAND Corporation 

A. J. Hoffman, International Business 
Machines Corporation 

H. P. Jones, Commander, SC, USN (Retired) 

S. Karlin, Stanford University 

H. W. Kuhn, Princeton University 

J. Laderman, Office of Naval Research 

R. J. Lundegard, Office of Naval Research

W. H. Marlow, The George Washington University 

B. J. McDonald, Office of Naval Research 

R. E. McShane, Vice Admiral, USN (Retired) 

W. F. Millson, Captain, SC, USN 

H. D. Moore, Captain, SC, USN (Retired) 



M. I. Rosenberg, Captain, USN (Retired) 

D. Rosenblatt, National Bureau of Standards 

J. V. Rosapepe, Commander, SC, USN (Retired) 
T. L. Saaty, University of Pennsylvania 

E. K. Scofield, Captain, SC, USN (Retired) 
M. W. Shelly, University of Kansas 

J. R. Simpson, Office of Naval Research 
J. S. Skoczylas, Colonel, USMC 
S. R. Smith, Naval Research Laboratory 
H. Solomon, The George Washington University 
I. Stakgold, Northwestern University 
E. D. Stanley, Jr., Rear Admiral, USN (Retired) 
C. Stein, Jr., Captain, SC, USN (Retired) 
R. M. Thrall, Rice University 
T. C. Varley, Office of Naval Research 
C. B. Tompkins, University of California 
J. F. Tynan, Commander, SC, USN (Retired) 
J. D. Wilkes, Department of Defense 
OASD (ISA) 



The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and 
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, 
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations. 

Information for Contributors is indicated on inside back cover. 

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, 
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing 
Office, Washington, D.C. 20402. Subscription Price: $5.50 a year in the U.S. and Canada, $7.00 elsewhere. Cost of 
individual issues may be obtained from the Superintendent of Documents. 

The views and opinions expressed in this quarterly are those of the authors and not necessarily those of the Office of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations, NAVEXOS P-35.



Permission has been granted to use the copyrighted material appearing in this publication. 



A GENERALIZED UPPER BOUNDING METHOD FOR DOUBLY 
COUPLED LINEAR PROGRAMS 



James K. Hartman 

Naval Postgraduate School 
Monterey, California 

and 

Leon S. Lasdon 

Case Western Reserve University
Cleveland, Ohio 



1. INTRODUCTION 

The constraints of large linear programs can often be partitioned into independent subsets, except 
for relatively few coupling rows and coupling columns. The individual subsets may, for example, arise 
from constraints on the activity levels of subdivisions of a large corporation. Alternatively, such blocks 
may arise from activities in different time periods. The coupling rows may arise from limitations on 
shared resources or from combining the outputs of subdivisions to meet overall demands. The coupling 
columns arise from activities which involve different time periods (e.g., storage), or which involve dif- 
ferent subdivisions (e.g., transportation or assembly). The case with only coupling rows or coupling 
columns, but not both has received much attention [8], [9]. A smaller amount of work has been done 
on the problem with both coupling rows and columns. Ritter has proposed a dual method [2], [7]. 
Except for the preliminary work of Webber and White [10] and Heesterman [4], there is no primal 
algorithm which exploits the structure of this problem. There is a need for such an algorithm, since 
such problems occur often in practice. A primal method is desirable since in large problems slow con- 
vergence may force termination of the algorithm prior to optimality. 

The algorithm proposed here is an extension of the generalized upper bounding method for prob- 
lems without coupling columns proposed in [5] and [6]. It produces the same sequence of extreme 
point solutions as the primal simplex method, and hence has the desirable convergence properties of 
that algorithm. However, the operations within each simplex iteration are organized to take maximal 
advantage of problem structure. Because of this structure it is possible to perform the computations 
while maintaining a compact representation of the basis inverse. In particular it is sufficient to main- 
tain and update at each cycle a working basis inverse, inverses from each block, and possibly another 
matrix, V in (4). The dimension of the working basis need never be more than the number of coupling 
rows in the problem plus the number of coupling columns in the current basis. Hence its dimension 
may change from cycle to cycle. At most one of the block inverses need be updated at any iteration.
Given these quantities, all information needed to carry out a simplex iteration can easily be obtained. 
Further, efficient relations are given for updating the working basis and block inverses and V for the 
next cycle. 

A significant amount of computational work has been done on a special class of production and 
inventory problems. Problems as large as 362 rows by 3225 columns were solved. Computation times 







are encouraging, although no comparisons with other methods are available. The results indicate that 
the algorithm is sensitive to the dimension of the working basis and that effective measures can be 
taken to minimize this dimension. 

When there are only coupling rows, the algorithm simplifies considerably and reduces to the 
procedure described in [5]. When each diagonal block contains only a single row, it reduces further 
to the generalized upper bounding method of Dantzig and Van Slyke [1]. 

2. BASIS STRUCTURE 

The problem considered here is 



minimize x_01

subject to

B_0 x_0 + Σ_{i=1}^{p} A_i x_i = b_0

D_i x_0 + B_i x_i = b_i,    i = 1, . . ., p,

where x_i is a vector with n_i components, all of which must be nonnegative except for x_01, the first component of x_0. The column corresponding to x_01 is all zero except for a 1 in the first row. This first row of the constraint matrix then defines the objective function. In matrix form the constraints can be represented:



            x_0       x_1      x_2     . . .   x_{p-1}    x_p                     No. of rows

          [ B_0       A_1      A_2     . . .   A_{p-1}    A_p     ]   = b_0          m_0
          [ D_1       B_1                                         ]   = b_1          m_1
(1)       [ D_2                B_2                                ]   = b_2          m_2
          [  .                           .                        ]      .            .
          [ D_{p-1}                            B_{p-1}            ]   = b_{p-1}      m_{p-1}
          [ D_p                                           B_p     ]   = b_p          m_p

            S_0       S_1      S_2     . . .   S_{p-1}    S_p                total = M rows
No. of columns:
            n_0       n_1      n_2     . . .   n_{p-1}    n_p                total = N columns



The problem has p diagonal blocks of dimension m_i × n_i and an L-shaped border consisting of m_0 coupling constraints and n_0 coupling variables. The set S_i denotes either the variables in the vector x_i or the columns of (1) associated with these variables. The total constraint matrix has dimension M × N. We assume throughout that this matrix has rank M so that any basis matrix for the system will contain M columns and will be nonsingular.
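The block-angular shape of (1) is easy to visualize by assembling a small instance numerically. The sketch below uses invented dimensions and random data (numpy, for illustration only) to build the full M × N constraint matrix from the borders B_0, A_i, D_i and the diagonal blocks B_i:

```python
import numpy as np

# Invented dimensions for a toy instance of (1): p = 2 diagonal blocks.
m0, n0 = 2, 2          # coupling rows / coupling variables (x_0)
m = [2, 3]             # rows m_i of each diagonal block
n = [3, 2]             # columns n_i of each block's variables x_i
rng = np.random.default_rng(0)

B0 = rng.standard_normal((m0, n0))
A = [rng.standard_normal((m0, ni)) for ni in n]              # coupling-row border
D = [rng.standard_normal((mi, n0)) for mi in m]              # coupling-column border
Bd = [rng.standard_normal((mi, ni)) for mi, ni in zip(m, n)]  # diagonal blocks

# Assemble the full constraint matrix of (1): L-shaped border plus blocks.
top = np.hstack([B0] + A)
rows = [np.hstack([D[i]] +
                  [Bd[i] if j == i else np.zeros((m[i], n[j]))
                   for j in range(len(m))])
        for i in range(len(m))]
full = np.vstack([top] + rows)

M, N = m0 + sum(m), n0 + sum(n)
print(full.shape)   # (7, 7), i.e., M rows by N columns
```

Off-diagonal positions outside the border stay zero, which is exactly the structure the algorithm exploits.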






Consider the structure of any basis, B, for the system of constraints (1). Arranging the columns of B in the same order as in (1) yields the basis matrix in Figure 1, where the shaded areas contain nonzero entries. The column for x_01 is always in the basis and will always be the first basic column.




Figure 1. (Basis matrix: the ℓ + 1 coupling columns from S_0 border rectangular blocks from S_1, S_2, S_4, S_5, S_6, and S_7 lying roughly along the diagonal.)

In this example the problem has seven blocks. There are no columns from block 3 in this particular basis. The basis matrix consists of q rectangular blocks, q ≤ p, roughly along the diagonal with borders from the coupling rows and coupling columns. The rectangular blocks may be either "tall," "square," or "wide."

The algorithm to be developed depends on having the lower right hand partition of B consist of 
square nonsingular blocks along the diagonal. Our strategy for handling the nonsquare blocks will be to 
rearrange the rows and columns of the basis matrix as follows: 

a) For blocks with a column excess, move the extra columns over beside the coupling columns 
leaving a square block. 

b) For blocks with a row excess, move the extra rows up with the coupling constraint rows leaving 
a square block. 

c) Assure that the resulting square diagonal blocks are nonsingular. 
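Steps a)-c) can be illustrated by a naive sketch (not the authors' procedure): given one rectangular block, repeatedly shift out excess columns, excess rows, or row-column pairs until a square nonsingular core remains. The rank test and the greedy drop-from-the-end choice below are invented for illustration:

```python
import numpy as np

def square_nonsingular_core(block, tol=1e-9):
    """Split a rectangular block into a square nonsingular core plus
    excess rows/columns (naive greedy illustration of steps a)-c))."""
    rows, cols = block.shape
    keep_r, keep_c = list(range(rows)), list(range(cols))
    while len(keep_r) != len(keep_c) or \
          np.linalg.matrix_rank(block[np.ix_(keep_r, keep_c)], tol=tol) < len(keep_r):
        if len(keep_c) > len(keep_r):
            keep_c.pop()                  # a) shift an excess column
        elif len(keep_r) > len(keep_c):
            keep_r.pop()                  # b) shift an excess row
        else:
            keep_r.pop(); keep_c.pop()    # c) drop a pair until nonsingular
    core = block[np.ix_(keep_r, keep_c)]
    return core, keep_r, keep_c

# A rank-1 "wide" block: one row-column pair must also be shed (as for
# block 5 in the example of Figure 2).
core, kr, kc = square_nonsingular_core(np.array([[1., 2., 3.],
                                                 [2., 4., 6.]]))
print(core.shape)   # (1, 1)
```

A production code would of course choose which rows and columns to shift using pivoting information rather than position.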

Performing these operations on the basis B in Figure 1 will give the matrix in Figure 2, where we have labeled the resulting nine submatrices for reference in Theorem 1.

Figure 2. (Repartitioned basis matrix: the m_0 coupling rows and the excess rows from blocks 3, 4, and 5 appear across the top; the ℓ + 1 coupling columns and the excess columns from S_1, S_5, and other blocks appear at the left; the S rows and S columns of the block diagonal section occupy the lower right.)

The submatrix γ_3 contains square nonsingular blocks and hence it is nonsingular. For this example block 5 was still singular after it was made square, so extra row-column pairs were removed from it until the remaining block became nonsingular. This introduced more excess rows and columns and created nonzero elements in the submatrix β_2.

For computational purposes the block diagonal section γ_3 should be as large as possible, but still nonsingular. A lower bound on its size is now derived.

THEOREM 1: Let S be the dimension of the block diagonal section γ_3 of Figure 2, and M the dimension of the entire basis matrix. Suppose that the partitioning has been done so all diagonal blocks are nonsingular, and that there is no alternate partitioning which would give larger nonsingular blocks. Then S ≥ M − ℓ − m_0, where ℓ + 1 is the number of coupling columns in the basis.

PROOF: Let any basis matrix for (1) be partitioned as illustrated in Figure 2:

    [ α_1  β_1  γ_1 ]
B = [ α_2  β_2  γ_2 ]
    [ α_3  β_3  γ_3 ].

The block diagonal submatrix γ_3 is nonsingular, and hence has rank S. Consider the submatrix [β_3, γ_3]. It also has rank S since it has S linearly independent columns (those of γ_3) and only S rows. Suppose there is a row [β, γ] of the submatrix [β_2, γ_2] which is not a linear combination of the rows of [β_3, γ_3] and, say, this row is an excess row from block j. By the special structure of this row (γ is zero except in block j) there must be at least one excess column in

[ β_2 ]
[ β_3 ]

from the same block j (otherwise the row [β, γ] could not be independent). Since the row [β, γ] is independent of the rows in the jth block of [β_3, γ_3], this row and one of the excess columns can be adjoined to the jth block to give a larger square nonsingular block than before. But this contradicts the hypothesis of the theorem. Hence every row of [β_2, γ_2] is a linear combination of rows of [β_3, γ_3]. This proves that the submatrix

K = [ β_2  γ_2 ]
    [ β_3  γ_3 ]

has rank S. Adjoin the first ℓ + 1 columns (the basic coupling columns) to K giving the submatrix

L = [ α_2  β_2  γ_2 ]
    [ α_3  β_3  γ_3 ].

L has rank ≤ S + ℓ since only ℓ + 1 columns were added and the first of these (the x_01 column) is all zero in these rows. Finally adjoin the first m_0 rows to L giving the entire basis matrix

    [ α_1  β_1  γ_1 ]
B = [ α_2  β_2  γ_2 ]
    [ α_3  β_3  γ_3 ].

Then B has rank ≤ S + ℓ + m_0 since only m_0 rows have been added to L and hence there can be at most m_0 more independent rows in B than in L. But rank B = M, so that M ≤ S + ℓ + m_0, or S ≥ M − ℓ − m_0, completing the proof of Theorem 1.






3. THE ALGORITHM 

Our approach to the solution of doubly coupled problems will be to apply the revised primal 
simplex method to the problem. The special structure of the basis matrix will be exploited to simplify 
the computational and storage problems which occur for large problems. The revised simplex method 
involves the following steps at each iteration: 

a) Find the simplex multipliers, π = c_B B^{-1}.

b) Price out the nonbasic columns P_j,

c̄_j = c_j − π P_j,

and choose a column P_s with negative c̄_s to enter the basis if the current solution is not optimal.

c) Transform the entering column in terms of the current basis,

P̄_s = B^{-1} P_s.

d) Determine the column to leave the basis,

x_Br / ā_rs = min { x_Bi / ā_is : ā_is > 0 },

where ā_is is component i of P̄_s.

e) Pivot to update B^{-1} and the current solution to account for the basis change.

Steps a) and c) involve multiplication by the basis inverse matrix B^{-1}. To maintain B^{-1} for large problems requires extensive storage and computation, since even though B has special structure, B^{-1} is essentially dense with nonzero elements. Instead of performing the computations in a) and c) directly, we will solve the equations

(2)  π B = c_B    (solve for π)

and

(3)  B P̄_s = P_s    (solve for P̄_s)

by making a transformation of the basis matrix B → R, where R is block triangular. The resulting equations in terms of R will make it possible to exploit the special structure of B.

Consider a nonsingular matrix T defined so that BT = R is upper block triangular,

(4)  B T = [ G   H   ] [ I   0 ]   [ B̄   H   ]
           [ J   B_1 ] [ V   I ] = [ 0   B_1 ] = R,

where B_1 is the block diagonal section γ_3; G, H, and J are the remaining partitions of B, e.g.,

G = [ α_1  β_1 ]
    [ α_2  β_2 ],

and

(5)  V = −B_1^{-1} J.

The matrix B̄ is central to the procedure and will be called the working basis. From (4) and (5) we see that

(6)  B̄ = G + H V = G − H B_1^{-1} J.

By Theorem 1, B can always be partitioned so that the dimension of B̄ is at most m_0 + ℓ.
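The factorization (4) and the working-basis identity (6) can be checked on random data; the partition sizes below are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
k, S = 3, 4                                  # k = M - S non-key rows/columns
G = rng.standard_normal((k, k))
H = rng.standard_normal((k, S))
J = rng.standard_normal((S, k))
B1 = rng.standard_normal((S, S)) + 5 * np.eye(S)   # block diagonal section

V = -np.linalg.solve(B1, J)                  # (5): V = -B1^{-1} J
T = np.block([[np.eye(k), np.zeros((k, S))],
              [V,         np.eye(S)]])
B = np.block([[G, H], [J, B1]])
R = B @ T                                    # (4): BT = R

Bbar = G + H @ V                             # (6): the working basis
assert np.allclose(R[:k, :k], Bbar)          # upper-left partition of R is Bbar
assert np.allclose(R[k:, :k], 0.0)           # R is upper block triangular
assert np.allclose(Bbar, G - H @ np.linalg.solve(B1, J))
```

In the actual algorithm B_1 is block diagonal, so the solve against B_1 separates into one small solve per diagonal block.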

THEOREM 2: The working basis B is nonsingular. 

PROOF: R is nonsingular since both B and T are. Since all elements of R below B are zero, 
B must be nonsingular. 

To find the simplex multipliers, consider Eq. (2). Multiply through by T giving

(7)  π R = π B T = c_B T = (1, 0, . . ., 0) T = (1, 0, . . ., 0).

The last equality holds since only x_01 appears in the objective function, and by convention x_01 is always the first basic variable. Since T is nonsingular, this multiplication will not change the solution π. Now partition π as π = (π_0, π_1, . . ., π_q), where

π_0 has M − S components and is the multiplier for rows in the working basis B̄;

π_j has s_j components and is the multiplier for rows in the jth diagonal block B_{j1}, which is s_j × s_j and nonsingular, j = 1, . . ., q. From the structure of R in (4), Eq. (7) becomes

(8)  π_0 B̄ = (1, 0, . . ., 0)    (M − S components), and

(9)  π_0 H_j + π_j B_{j1} = 0    (s_j components),    j = 1, . . ., q,

where H_j is the submatrix of H containing the s_j columns in the jth block. Solving (8) gives

(10)  π_0 = (1, 0, . . ., 0) B̄^{-1} = first row of B̄^{-1},

which can be substituted into (9) to give

(11)  π_j = −π_0 H_j B_{j1}^{-1},    j = 1, . . ., q.

Hence to obtain the simplex multipliers it suffices to know the inverse of the working basis and the inverse of each of the diagonal blocks.
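Formulas (10)-(11) can be verified against a direct solve of (2) for the simplest case of a single diagonal block (q = 1); all data below are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
k, s1 = 3, 4                                   # working-basis and block sizes
G = rng.standard_normal((k, k))
H1 = rng.standard_normal((k, s1))
J = rng.standard_normal((s1, k))
B11 = rng.standard_normal((s1, s1)) + 5 * np.eye(s1)

Bbar = G - H1 @ np.linalg.solve(B11, J)        # (6)
cB = np.zeros(k + s1); cB[0] = 1.0             # only x_01 appears in the objective

pi0 = np.linalg.inv(Bbar)[0]                   # (10): first row of Bbar^{-1}
pi1 = -pi0 @ H1 @ np.linalg.inv(B11)           # (11)

B = np.block([[G, H1], [J, B11]])
pi_direct = np.linalg.solve(B.T, cB)           # (2): pi B = cB, solved densely
assert np.allclose(np.concatenate([pi0, pi1]), pi_direct)
```

With several blocks, (11) is applied once per block, each using only that block's inverse.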

Next consider Eq. (3) for transforming the entering column in terms of the current basis. Define a nonsingular change of variables by

(12)  z = T^{-1} P̄_s.

Then P̄_s = T z, so Eq. (3) becomes

(13)  R z = B T z = B P̄_s = P_s,

which is a block triangular system and hence easier to solve than (3). Partition z and P_s as (z_0, z_1, . . ., z_q) and (P_{s0}, P_{s1}, . . ., P_{sq}), respectively. Then (13) can be written as

(14)  B̄ z_0 + Σ_{j=1}^{q} H_j z_j = P_{s0}

(15)  B_{j1} z_j = P_{sj},    j = 1, . . ., q.

Solving (15) gives

(16)  z_j = B_{j1}^{-1} P_{sj},

which when substituted into (14) gives

(17)  z_0 = B̄^{-1} ( P_{s0} − Σ_{j=1}^{q} H_j z_j ).

If P_s is not a coupling column, these formulas simplify since then P_{sj} = 0 for all but one j (j = 1, . . ., q). Hence by (16) z_j = 0 for all but one j and the summation in (17) has only one term.

It is now a simple matter to obtain P̄_s from z using P̄_s = T z. With P̄_s partitioned as (P̄_{s0}, P̄_{s1}, . . ., P̄_{sq}), the structure of T in (4) gives

(18)  P̄_{s0} = I z_0 = z_0, and

(19)  P̄_{sj} = V_j z_0 + I z_j = V_j z_0 + z_j,    j = 1, . . ., q,

where V_j is the submatrix of V containing the rows in the jth block. This completes the transformation of the column entering the basis.
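Similarly, the transformed entering column produced by (16)-(19) can be checked against a direct solve of (3) (single-block sketch, invented data):

```python
import numpy as np

rng = np.random.default_rng(4)
k, s1 = 3, 4
G = rng.standard_normal((k, k)); H1 = rng.standard_normal((k, s1))
J = rng.standard_normal((s1, k))
B11 = rng.standard_normal((s1, s1)) + 5 * np.eye(s1)
B = np.block([[G, H1], [J, B11]])
Bbar = G - H1 @ np.linalg.solve(B11, J)            # (6)

Ps = rng.standard_normal(k + s1)                   # entering column
Ps0, Ps1 = Ps[:k], Ps[k:]

V1 = -np.linalg.solve(B11, J)                      # block-1 rows of V, from (5)
z1 = np.linalg.solve(B11, Ps1)                     # (16)
z0 = np.linalg.solve(Bbar, Ps0 - H1 @ z1)          # (17)
Ps_bar = np.concatenate([z0, V1 @ z0 + z1])        # (18)-(19)

assert np.allclose(B @ Ps_bar, Ps)                 # Ps_bar solves (3)
```

Only one block solve per nonzero partition of P_s is needed, which is the source of the savings for non-coupling columns.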

4. CHANGING THE BASIS 

Pricing out columns to select the entering column and choosing the column to leave the basis are done as in the ordinary revised simplex method, so it only remains to describe the updating procedures. Since the entire basis inverse B^{-1} is not needed, the updating requirements will be somewhat different from those of the ordinary revised simplex algorithm. In particular, reviewing the solution of Eqs. (2) and (3) shows that it is sufficient to update B̄^{-1}, B_{j1}^{-1} (j = 1, . . ., q), and V at each iteration.

Before describing the various cases which can occur, we derive a general result for updating the 
working basis inverse. 

THEOREM 3: If a basis change can be described by *B^{-1} = E B^{-1}, where *B^{-1} is the basis inverse after the change, and E is any transformation matrix, then the working basis inverse can be updated by

(20)  *B̄^{-1} = (E_1 + E_2 V) B̄^{-1},

where E_1 and E_2 are partitions of E to be described in the proof.

PROOF: From the definition and nonsingularity of R and T in (4), (5),

(21)  *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} E B^{-1} = *T^{-1} E T R^{-1}.

Here all *'ed symbols relate to the system after the basis change is made. Writing (21) in partitioned form and using partitioned inverse theorems gives

(22)  [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]   [  I    0 ] [ E_1  E_2 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
      [    0             *B_1^{-1}      ] = [ −*V   I ] [ E_3  E_4 ] [ V   I ] [    0            B_1^{-1}     ].

To update the working basis inverse, note that it appears in the upper left hand partition of *R^{-1}. Hence deleting all but that submatrix gives the following specialization of (22):

(23)  *B̄^{-1} = [ E_1  E_2 ] [ I   0 ] [ B̄^{-1} ]
                             [ V   I ] [    0   ]  = (E_1 + E_2 V) B̄^{-1}.

Thus E_1 + E_2 V is a transformation matrix for the working basis inverse.
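Theorem 3 can be spot-checked numerically; in the sketch below E is an arbitrary well-conditioned transformation and the partition sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(5)
k, S = 3, 4                                   # k = M - S non-key rows/columns

def working_parts(Bfull):
    """Working basis and V for a basis partitioned as in (4)."""
    G, H = Bfull[:k, :k], Bfull[:k, k:]
    J, B1 = Bfull[k:, :k], Bfull[k:, k:]
    V = -np.linalg.solve(B1, J)
    return G + H @ V, V                       # (6)

B = rng.standard_normal((k + S, k + S)) + 5 * np.eye(k + S)
Bbar, V = working_parts(B)

E = np.eye(k + S) + 0.3 * rng.standard_normal((k + S, k + S))
Bstar = np.linalg.inv(E @ np.linalg.inv(B))   # so that *B^{-1} = E B^{-1}
Bbar_star, _ = working_parts(Bstar)

E1, E2 = E[:k, :k], E[:k, k:]                 # partitions of E as in (22)
lhs = np.linalg.inv(Bbar_star)                # *Bbar^{-1}, recomputed from scratch
rhs = (E1 + E2 @ V) @ np.linalg.inv(Bbar)     # (20)
assert np.allclose(lhs, rhs)
```

In the cases below E is elementary or a permutation, so E_1 + E_2 V is cheap to apply.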

Let the columns from the block diagonal section of B be called "key" columns. The remaining columns of B are called "non-key." There are several cases to consider in the updating procedure:

CASE 1: The column leaving the basis is non-key. Then the entering column can be brought into the basis as a non-key column, and the standard updating formula for B^{-1} is *B^{-1} = E B^{-1}. Here E is an elementary matrix, equal to the identity except in the rth column, which is given by [η_1, η_2, . . ., η_M]', where

(24)  η_i = −ā_is / ā_rs,  i ≠ r,  i = 1, . . ., M;        η_r = 1 / ā_rs

(recall that the rth basic column is leaving the basis). Since the leaving column is non-key, partitioning E as in Theorem 3 gives E_2 = 0, and E_1 is an (M−S) × (M−S) elementary matrix equal to the identity except in the rth column, which is (η_1, . . ., η_{M−S})'.

To update the working basis inverse for Case 1, substitute these into (20):

(25)  *B̄^{-1} = (E_1 + E_2 V) B̄^{-1} = (E_1 + 0·V) B̄^{-1} = E_1 B̄^{-1}.

Hence B̄^{-1} is updated by a single pivot operation.

None of the key columns in B change in Case 1, and hence none of the diagonal blocks B_{j1} in B_1 will change. Thus the block inverses B_{j1}^{-1} require no updating. To update V, recall that *V = −*B_1^{-1} *J. Now *B_1^{-1} = B_1^{-1} as shown above, and *J is different from J only in the rth column, which is replaced by the last S components of the entering column (P_{s1}, . . ., P_{sq})'. Thus only the rth column of V will change, and this will be replaced by

(26)  −B_1^{-1} (P_{s1}, . . ., P_{sq})' = −(z_1, . . ., z_q)',

which has already been calculated in transforming the entering column. This completes the updating calculations for Case 1.

CASE 2: The column leaving the basis is a key column (from the block diagonal section).

CASE 2a: Both the leaving column and the entering column are from the same block B_{j1}. Then if the entering column will leave B_{j1} nonsingular, a direct pivot can be performed and the basis change can be described by *B^{-1} = E B^{-1}, where E is an elementary column matrix. The E matrix differs from that in Case 1 only because the (η_1, . . ., η_M)' column is now in the key section of the matrix. Thus E_1 = I_{(M−S)×(M−S)}, and E_2 has only one nonzero column, (η_1, . . ., η_{M−S})', which is the (r − (M−S))th. To update the working basis inverse apply formula (20):

(27)  *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}
             = ( I + (η_1, . . ., η_{M−S})' v ) B̄^{-1}
             = B̄^{-1} + (η_1, . . ., η_{M−S})' (v B̄^{-1}),

where v is the (r − (M−S))th row of V.

In the block diagonal submatrix B_1 only the (r − (M−S))th column will change. Hence only one diagonal block inverse, say B_{j1}^{-1}, will change. B_{j1}^{-1} will be updated by a single pivot, or equivalently,

(28)  *B_{j1}^{-1} = E B_{j1}^{-1},

where E is an elementary column matrix formed from the elements of the new column. This change will leave B_{j1} nonsingular if and only if the pivot element in this new column is nonzero. This condition must be checked to see if Case 2a can be applied.

The J submatrix will not change, and only the jth block of B_1^{-1} changes, so in V = −B_1^{-1} J only the jth partition will change. The updated version is given by

(29)  *V_j = −(*B_{j1}^{-1}) J_j = −E B_{j1}^{-1} J_j = E V_j,

so the same elementary matrix is used to update V.

CASE 2b: The column leaving the basis is a key column from block B_{j1}. Case 2a cannot be used either because the entering column is not from the same block (it may be a coupling column) or because the nonsingularity test in Case 2a failed. Then we can avoid making B_{j1} smaller only if there is an excess column (see Figure 2) from block j which can be exchanged with the leaving column. If there is such a column, say the kth basic column, then after the exchange *B = BE, where E is a permutation matrix, an identity matrix with the rth and kth columns exchanged. Since E^{-1} = E, *B^{-1} = E B^{-1}.

Then (20) yields *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}, where E_1 is an (M−S) × (M−S) identity except for a zero in the kth diagonal position, and E_2 is zero except for a one in the kth row and (r − (M−S))th column. Hence

(30)  *B̄^{-1} = (E_1 + E_2 V) B̄^{-1} = [ identity with row k replaced by v ] B̄^{-1},

where v is the (r − (M−S))th row of V. This is a valid interchange if and only if E_1 + E_2 V is nonsingular, which is true if and only if the kth element of the row v is nonzero. If all elements of v in the excess columns are zero, then no interchange is possible and we proceed to Case 2c.

If an exchange occurs, one column of the block B_{j1} will change. Hence the inverse can be updated by a simple pivot

(31)  *B_{j1}^{-1} = E B_{j1}^{-1},

where E is an elementary matrix. The elements needed to form the eta column in E are found in the kth column of V_j.

To update V, note that by definition *V = −*B_1^{-1} *J. Only the jth block of B_1^{-1} changes, and a single column of J changes, but this column is zero outside the jth block J_j. Hence only the submatrix V_j needs to be updated,

(32)  *V_j = −*B_{j1}^{-1} *J_j = −E B_{j1}^{-1} *J_j.

Now *J_j = J_j except in the kth column, which is replaced by the leaving column, which we will suppose to be the lth column in B_{j1}. Thus B_{j1}^{-1} *J_j = −V_j except in the kth column, which becomes the lth unit vector, so that

(33)  *V_j = E V_j

except in the kth column, which is the negative of the eta column of E.

This completes the updating required for an exchange of key and non-key basic columns. After the exchange the leaving column is non-key, so Case 1 can be used to bring the new column into the basis. It should be noted that Case 1 uses the elements ā_is of the transformed entering column (see (24)) and the vector (z_1, . . ., z_q) (see (26)). The interchange will affect these quantities. To update them, interchange ā_rs and ā_ks, and replace z_j by E z_j.

CASE 2c: The column leaving the basis is a key column from block B_{j1}, and neither Case 2a nor 2b applies. When the column leaves, the new *B_{j1} will have one less column, and hence to remain square and nonsingular it must also lose a row. The process is most easily described by a repartitioning step followed by an application of Case 1. In the repartitioning step, the leaving column is shifted to the excess column partition, and some row of B_{j1} is made an excess row.

Without loss of generality assume that the pair being shifted is the first column and row of B_{11}.






The basis matrix before and after the change is given below. All elements of the matrices are identical; only the partitioning changes.

(34)  B = [ G   H   ]
          [ J   B_1 ],

with the first row and column of B_{11} marked off by dashed lines. This is the original partitioning. The row and column indicated by dashed lines will be added to the non-key section, resulting in the new partitioning scheme given below.

(35)  B = [ *G   *H   ]
          [ *J   *B_1 ].

In this new partitioning of the basis, *G has become larger by one row and one column, *B_{11} has become smaller, and the other partitions change in corresponding ways.

Now *B = B, but because of the changes in partition sizes, *T ≠ T and *R ≠ R. Hence we must compute *B_{11}^{-1}, *B̄^{-1}, and *V.

*B_{11}^{-1} is computed via standard formulas. If

(36)  B_{11}^{-1} = [ W  X ]
                    [ Y  Z ],

where Z contains all but the first row and column, then

(37)  *B_{11}^{-1} = Z − (1/W) Y X.

This is possible only if W ≠ 0, which provides a criterion for choosing the row to be shifted.†

To obtain *B̄^{-1} recall that B = *B, so

(38)  *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.

† In general, if the kth column of B_{j1} is being shifted, then the lth row can be shifted if the lth element of row k in B_{j1}^{-1} is nonzero. There must always be at least one nonzero element in row k since B_{j1}^{-1} is nonsingular.






In partitioned form this is

(39)  [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]   [  I    0 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
      [    0             *B_1^{-1}      ] = [ −*V   I ] [ V   I ] [    0            B_1^{-1}     ].

To isolate *B̄^{-1}, delete all but the upper left hand corner of this expression, giving:

(40)  *B̄^{-1} = [ I   0 ] [ B̄^{-1}   h ]   [  B̄^{-1}       h      ]
                [ v   1 ] [    0      W ] = [ v B̄^{-1}   v h + W  ],

where h is the first column of −B̄^{-1} H B_1^{-1}, [W, Y']' is the first column of B_{11}^{-1}, and v is the first row of V.

The vector h is computed as

(41)  h = first column of −B̄^{-1} H B_1^{-1}
        = −B̄^{-1} H (first column of B_1^{-1})
        = −B̄^{-1} H (W | Y | 0)'
        = −B̄^{-1} H_1 (W | Y)',

where H_1 is the first partition of H.

We note that the repartitioning here changes all partitions of B. Hence in the expression B̄ = G − H B_1^{-1} J for the working basis, all four factors change. Nevertheless, we need only add a "border" to B̄^{-1} to get *B̄^{-1}. Starting from the equation *T^{-1} = *R^{-1} R T^{-1} and arguing as above, a formula for updating V is obtained:

(42)  *V = [ V_1 − (1/W) Y v    (1/W) Y ]
           [        V̂               0   ],

where v is the first row of V, V_1 is the rest of the first block of V, and V̂ is all other blocks of V. Note that *V has one less row and one more column than V and that the only nontrivial change occurs in the first block.

Performing these operations repartitions the basis matrix so that the leaving column becomes non-key. Applying Case 1 will then complete the basis change for Case 2c.

CASE 3: Using the above cases the simplex method can be performed, but the working basis will increase in dimension by one each time Case 2c occurs. Hence it may be desirable to periodically repartition the basis by moving excess rows and columns into the block diagonal section, essentially the reverse of Case 2c.

Suppose we have an excess row and an excess column from block j. Consider the following submatrices:

(43)  the excess row (α, β), the excess column (α, γ')', and B_{j1}, the jth diagonal block in B_1,

where α is the element common to the excess row and the excess column, β is the part of the excess row in the columns of B_{j1}, and γ is the part of the excess column in the rows of B_{j1}. Bringing the row and column into the key section would mean adding them to B_{j1}, resulting in a larger jth diagonal block:

(44)  *B_{j1} = [ α   β     ]
                [ γ   B_{j1} ].

We already know B_{j1}^{-1} and we want to compute the inverse of the new block. If this inverse exists, suppose that it is partitioned as

(45)  *B_{j1}^{-1} = [ W  X ]
                     [ Y  Z ].

Then [3] the blocks are given by

(46)  W = 1/(α − β B_{j1}^{-1} γ)
      X = −W β B_{j1}^{-1}
      Y = −B_{j1}^{-1} γ W
      Z = B_{j1}^{-1} − B_{j1}^{-1} γ X,

and the inverse exists if and only if

(47)  α − β B_{j1}^{-1} γ ≠ 0.





For computational purposes, note that −B_{j1}^{-1} γ is just the jth partition of the column in V corresponding to the column being moved. Hence to test for possible pairs to be moved requires computation of an inner product of β with a column of V_j.

For convenience in expression, and without loss of generality, we again assume that the last row and column of the non-key section will become the first row and column of the key section. The elements of *B will be exactly the same as the elements of B; only the partitions will change. The new working basis *B̄ will become smaller. To find an expression for its inverse consider the expression

(48)  *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.

Writing this in partitioned form will give the following matrix equation:

(49)  [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]   [  I    0 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
      [    0             *B_1^{-1}      ] = [ −*V   I ] [ V   I ] [    0            B_1^{-1}     ].

Taking only the first rows and columns of this equation gives:

(50)  *B̄^{-1} = B̄^{-1} with its last row and column deleted.

Thus to get the new working basis inverse, merely delete the last row and the last column in the current working basis inverse.

To calculate *V use the equation

(51)  *T^{-1} = *R^{-1} *B = *R^{-1} B = *R^{-1} R T^{-1}.

Then writing this in partitioned form and proceeding as above yields the result

(52)  *V = [   −W ψ    ]
           [ V_1 − Y ψ ]
           [     V̂     ],

where V_1 is the first partition of V less the last column, and V̂ is the remaining partitions of V less the last column. For this to be computed we must find ψ, which is the bottom row of the working basis less its last component ξ. The working basis is not carried along explicitly, so we must compute ψ. Since

B̄ = G + HV from (6),

where G and H are partitions of the basis matrix B and V is updated at each iteration of the process,

(53)  [ψ, ξ] = last row of B̄
            = last row of G + (last row of H) · V
            = g + h · V      (g = last row of G, h = last row of H)
            = g + [β, 0] · V
            = g + β V_1,

and dropping the last element of this gives ψ.

In the above calculations the most complicated formulas were those for updating V in the two repartitioning procedures, Eqs. (42) and (52). An alternative procedure in these cases is to compute V from its definition V = −B_1^{-1} J. In each case, only one partition of V changes, say partition V_j, given by

(54)  *V_j = −*B_{j1}^{-1} *J_j,

with B_{j1} of dimension s_j. To compute this requires s_j² multiplications for each nonzero column of *J_j, a total of at most (M−S) × s_j² multiplications. The updating calculations require on the order of (M−S) × s_j multiplications. Hence they are computationally superior, but require more extensive program logic.

After performing the appropriate updating procedure, we are ready to start the next simplex 
iteration. Thus the description of the basic algorithm is complete. 

5. SPECIALIZATION TO PROBLEMS WITH ONLY COUPLING ROWS 

For problems which have no coupling columns, the algorithm simplifies considerably. The major
simplification is that no repartitioning will ever be necessary and there will never be any excess rows.
Hence Cases 2c and 3 of the updating procedure, which are the most complicated, are never needed.
The working basis B in (6) will always have exactly as many rows as there are coupling constraints,
and each of the blocks B_jj has dimension m_j (see (1)). Relations (10), (11) for the simplex multipliers
remain the same, as do relations (16)–(19) for the transformed entering column. In updating, only Cases
1, 2a, and 2b can occur. These remain essentially the same. The computations involving V simplify
because each column of V except the first has only one nonzero partition. This specialization of the
algorithm is very similar to the method proposed by Kaul [5] and is described by Lasdon in [6].

6. APPLICATIONS TO PRODUCTION AND INVENTORY PROBLEMS 

Consider a corporation which wishes to schedule the production of K products at L plants for T 
time periods into the future. At each plant, and for each period the demand for each product is con- 
sidered known and must be met in that period. The operation of the lth plant in the tth time period is
hence limited by these K demand constraints and also by r constraints on locally available resources
(e.g., plant capacity, labor), giving rise to a constraint block with K + r rows. There are LT such diagonal
blocks. These blocks are coupled by constraints on scarce corporate resources which are allocated 
across the various plants and budgeted over time (e.g., corporate capital, scarce raw materials, highly 



426 J. K. HARTMAN AND L. S. LASDON

skilled labor). In addition to producing for immediate demand, any plant may 

a) Produce a product and place it in inventory for future use. 

b) Produce a product and ship it to some other plant which has a shortage of that product. 
Each of the inventory and transportation activities gives rise to a column which couples two of 

the diagonal blocks. Thus we have a doubly coupled linear program to solve. The number of rows is 
on the order of (K + r)LT, so truly large problems may result. There are KLT(2L + T − 3)/2 coupling
columns arising from the inventory and transportation activities, but these have very special structure. 
Each column has only a cost coefficient, a single +1 in the kth demand equation for one block, and a
single −1 in the kth demand equation for another block. Hence they may be stored implicitly and
priced out with minimal effort. Since these coupling activities incur a cost in addition to the production 
cost, we anticipate that even though there are many such activities, relatively few will be profitable. 
Hence in any basis there should be relatively few coupling columns so that the algorithm described 
can be applied. 
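Because each coupling column consists of only a cost coefficient, a +1 in the kth demand row of one block, and a −1 in the kth demand row of another, it can be priced out from three numbers without ever being stored. A sketch of such an implicit representation (the data layout, class, and names are invented; the prices stand in for the simplex multipliers of the demand rows of each block):

```python
from dataclasses import dataclass

@dataclass
class CouplingColumn:
    cost: float       # inventory or transportation cost
    k: int            # product index of the demand rows it touches
    block_plus: int   # block receiving the +1 entry
    block_minus: int  # block receiving the -1 entry

def reduced_cost(col, demand_prices):
    """Price out an implicitly stored coupling column.

    demand_prices[b][k] is the simplex multiplier of product k's
    demand row in block b; the column's only nonzeros are a +1 and
    a -1 in those two rows.
    """
    return (col.cost
            - demand_prices[col.block_plus][col.k]
            + demand_prices[col.block_minus][col.k])

# Two blocks, three products each (multipliers invented):
prices = [[1.0, 2.0, 0.5], [0.8, 3.1, 0.2]]
ship = CouplingColumn(cost=0.4, k=1, block_plus=0, block_minus=1)
# reduced cost = 0.4 - 2.0 + 3.1 = 1.5
assert abs(reduced_cost(ship, prices) - 1.5) < 1e-12
```

Pricing all KLT(2L + T − 3)/2 coupling activities this way costs only a couple of lookups each, which is what makes the "minimal effort" claim plausible.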

A number of computational simplifications appear. In computing the transformed entering
column, at most two of the z_j in (16) will be nonzero, so (17) simplifies considerably. The special form
of the coupling columns implies that in J any coupling column has only two nonzero elements, +1 and
−1, while an excess column has only one partition nonzero. Hence in V = −B⁻¹J a coupling column
has at most two nonzero partitions, and these are columns of the block inverses B_jj⁻¹. An excess column
has one nonzero partition in V. Hence it is probably best to compute V at each iteration instead of
updating it. This is desirable since the most complicated update formulas are those involving V.

7. COMPUTATIONAL RESULTS 

The algorithm described above has been coded and used to solve a number of test problems of 
the type described in section 6. The program was written in FORTRAN V for the Univac 1108 com- 
puter at Case Western Reserve University. The special structure of these problems made it possible 
to solve reasonably large programs all in core. All problems were solved in single precision arithmetic. 
Whenever a block inverse or the working basis inverse had been updated 50 times, it was re-inverted 
using a standard Gaussian elimination routine. Good numerical accuracy was obtained in that different
runs on the same problem yielded solutions which were the same to seven significant figures. The 
code was not written to be competitive with commercial routines, but rather to investigate the effects 
of various pricing and repartitioning strategies. Nevertheless the solution times recorded are encouraging. 

Data describing the test problems is given in Table 1. The notation used is as in section 6. For each 
problem size, two problems were formulated. The lower numbered problem of each pair was constructed 
to be relatively easy, in that few coupling columns appear in an optimal basis. The second problem in 
each pair is derived from the first by adjusting the right hand side demand and resource availability 
vector so that more coupling activities are required. Hence it is more difficult. For all problems, phase 
I was initiated with a basis consisting of slack variables for all resource constraints and artificial vari- 
ables for demand constraints. 

Table 2 describes the effect of three different pricing strategies. Pricing strategy one allowed 
coupling columns to enter the basis at any iteration. It led to long running times and large working 
bases since many coupling columns tended to enter the basis in phase I, even in problems for which 
there were none in the optimal basis. Pricing strategy two did not allow coupling columns to enter the 
basis in the first M iterations unless all other columns priced out optimally. It produced a substantial 
reduction in running time. Strategy three, which was the most successful, allowed no coupling columns 






Table 1. Test Problem Descriptions

Problem   Number of   Number of   Number of      Number of   Block    Number of   Number of   Total
numbers   products    plants      time periods   blocks      size     coupling    coupling    problem
          (K)         (L)         (T)            (LT)                 columns     rows        size

1, 2      5           3           4              12          7 x 22   210         6           90 x 474
3, 4      5           3           6              18          7 x 22   405         8           134 x 801
5, 6      5           3           8              24          7 x 22   660         10          178 x 1188
7, 8      5           3           10             30          7 x 22   975         12          222 x 1635
9, 10     5           5           6              30          7 x 22   975         8           218 x 1635
11, 12    5           5           8              40          7 x 22   1500        10          290 x 2380
13, 14    5           5           10             50          7 x 22   2125        12          362 x 3225



Table 2. Effect of Pricing Strategies

Problem   Pricing    Total        Total    Iterations   Maximum size   Maximum number of
number    strategy   iterations   time     (per sec)    of working     coupling columns
                                  (sec)                 basis          in basis

1         1          329          18.52    17.8         31             28
          3          178           5.05    35.2          9              3
3         1          440          36.88    11.9         37             34
          2          326          12.99    25.1         22             15
          3          274           9.39    29.2         10              4
5         1          Exceeded 60 second time limit
          2          451          25.33    17.8         25             20
          3          321          14.43    22.2         15              6
7         1          Not attempted
          2          557          38.07    14.6         27             21
          3          438          24.12    18.2         21             11

(All runs made with Case 3 repartitioning attempted every 10 iterations)












Table 3. Effects of Repartitioning Strategies

Problem  Repartitioning  Total       Total    Iterations  Number of occurrences of       Final size  Final no.  Maximum  Maximum no.
number   strategy        iterations  time     (per sec)   cases in updating              of working  coupling   working  coupling
                                     (sec)                1     2a    2b    2c    3      basis       columns    basis    columns

1        1               178          4.95    36.0        82    96    25    5     0      11          0          11       3
         2               178          5.05    35.2        81    97    26    7     17     6           0          9        3
         3               178          4.60    38.7        81    97    26    7     35     6           0          9        3
         6               178          4.62    38.5        81    97    25    9     53     6           0          9        3
2        1               299          7.71    38.8        158   144   45    9     0      15          5          15       7
         2               311          7.66    40.6        163   148   50    17    31     10          5          11       8
         3               312          7.66    40.7        162   150   48    19    62     9           5          11       8
         6               311          7.72    40.3        162   149   49    20    64     10          5          12       8
5        1               322         16.17    19.9        113   209   38    9     0      19          0          19       6
         2               321         14.43    22.2        108   213   38    9     31     10          0          15       6
         3               321         13.49    23.8        108   213   38    9     64     10          0          15       6
         6               321         13.46    23.8        108   213   38    9     174    10          0          15       6
6        1               711         45.97    15.5        361   350   84    26    0      36          18         36       23
         2               733         39.81    18.4        379   354   82    63    73     22          16         25       20
         3               706         38.16    18.5        349   357   76    59    141    24          18         26       22
         6               675         36.33    18.6        322   353   69    78    426    23          17         28       23
9        1               421         24.99    16.8        192   229   62    13    0      21          4          21       12
         2               416         21.96    18.9        188   228   55    34    41     12          6          17       13
         3               416         21.66    19.2        188   228   55    34    83     10          6          17       13
         6               416         21.38    19.5        188   228   55    32    132    10          6          18       13
10       1               1023        62.14    16.5        638   385   200   25    0      33          15         33       26
         2               1034        54.13    19.1        656   378   175   77    103    21          15         29       26
         3               1025        54.78    18.7        646   379   177   77    205    21          15         26       23
         6               1023        53.01    19.3        650   373   171   85    705    21          15         27       22
11       1               570         46.35    12.3        277   293   84    15    0      25          5          25       14
         2               565         38.43    14.7        263   302   86    39    56     12          5          21       16
         3               565         38.42    14.7        263   302   86    39    113    12          5          21       16
         6               577         38.78    14.9        285   292   86    40    244    11          5          21       16
12       1               1194        92.68    12.9        748   446   233   28    0      38          17         38       23
         2               1209        84.14    14.4        764   445   217   76    121    23          16         30       26
         3               1208        89.49    13.5        766   442   225   81    241    25          17         33       27
         6               1231        87.02    14.1        784   447   220   98    900    25          18         33       26
13       1               754         96.01     7.9        390   364   108   26    0      38          7          38       19
         2               747         69.41    10.8        372   375   102   55    74     14          9          26       19
         3               751         68.44    11.0        377   374   99    61    150    14          9          26       19
         6               769         73.02    10.5        397   372   104   72    372    17          9          30       22
14       1               1364       137.38     9.9        787   577   258   81    0      43          20         43       26
         2               1390       123.29    11.3        790   600   238   79    139    28          20         33       26
         3               1389       118.84    11.7        807   582   247   82    278    27          20         34       28
         4               1351       119.74    11.3        750   601   237   89    1351   29          21         35       27
         5               1379       126.08    10.9        789   590   248   79    414    30          21         37       27
         6               1351       113.60    11.9        750   601   237   89    1066   29          21         35       27




to enter the basis (unless necessary) until all artificial variables have left the basis. Pricing strategy 
three was used for all of the remaining runs. All problems in Table 2 were solved with the repartitioning 
procedure of Case 3 attempted every 10 iterations (see section 4). 

Table 3 shows the effects of various repartitioning strategies, i.e. strategies for employing Case 3. 
The strategies tested are to attempt Case 3: 

1. never 

2. every 10 iterations 

3. every five iterations 

4. every iteration 

5. whenever there are at least 10 excess columns 

6. whenever there are at least five excess columns. 

The different strategies often gave rise to slightly different numbers of iterations, probably because 
different strategies result in different orderings of the rows in the problem. Hence if the basis is de- 
generate, ties may be broken in different ways, and different columns will leave the basis. 

Using the strategy of never employing Case 3 involves a tradeoff. The computations of Case 3 
never have to be performed, and Case 2c is performed less often. This is because excess columns 
accumulate in the non-key section so the interchange of Case 2b is more likely to succeed. However 
if Case 3 is not performed, then the dimension of the working basis increases whenever Case 2c is 
performed, and never decreases. The overall result is that the number of iterations per second is lowest 
among all strategies tested. Hence it is desirable to repartition the working basis periodically. Among 
the other strategies tested, no consistent differences emerged. 

REFERENCES 

[1] Dantzig, G. B. and R. M. Van Slyke, "Generalized Upper Bounded Techniques for Linear Pro- 
gramming," Journal of Computer and System Sciences 1, 213-226 (1967). 

[2] Grigoriadis, M. D. and K. Ritter, "A Decomposition Method for Structured Linear and Nonlinear 
Programs," Journal of Computer and System Sciences, 3, 335-360 (1969). 

[3] Hadley, G., Linear Algebra (Addison-Wesley, Reading, Mass., 1961). 

[4] Heesterman, A. R. G., "Special Simplex Algorithm for Multisector Problems," Numerische Mathe- 
matik 12, 288-306 (1968).

[5] Kaul, R. N., "An Extension of Generalized Upper Bounded Techniques for Linear Programs," 
ORC-65-27, Operations Research Center, University of California, Berkeley (1965). 

[6] Lasdon, L. S., Optimization Theory for Large Systems (Macmillan Company, New York, 1970).

[7] Ritter, K., "A Decomposition Method for Linear Programming Problems with Coupling Constraints 
and Variables," MRC No. 739, Mathematics Research Center, University of Wisconsin (1967). 

[8] Rosen, J. B., "Primal Partition Programming for Block Diagonal Matrices," Numerische Mathe- 
matik 6, 250-260 (1964). 

[9] Sakarovitch, M. and R. Saigal, "An Extension of Generalized Upper Bounding Techniques for 
Structured Linear Programs," SIAM Journal on Applied Mathematics 15, 906-914 (1967). 
[10] Webber, D. W. and W. W. White, "An Algorithm for Solving Large Structured Linear Program- 
ming Problems," IBM New York Scientific Center Report No. 320-2946 (1968). 



NONLINEAR PROGRAMMING:
THE CHOICE OF DIRECTION BY GRADIENT PROJECTION*



Philip B. Zwart 

Washington University 
St. Louis, Missouri 



ABSTRACT 

Rosen's method of Gradient Projection chooses a search direction which is not neces- 
sarily the direction of steepest ascent. However, the projection of the gradient onto a "suit- 
ably chosen subspace" does yield the direction of steepest ascent. The suitable choice is 
easily recognized as a result of some theorems relating gradient projection to steepest ascent. 
These results lead to a modification of Rosen's method. The modification improves the choice 
of search direction and usually yields the steepest ascent direction without solving a quadratic 
programming problem. 

I. INTRODUCTION 

The general continuous nonlinear programming problem can be stated as: 

Find x* = (x_1*, . . ., x_n*) which yields the maximum value of F(x) among all x which satisfy
φ_i(x) ≥ 0, i = 1, . . ., k, i.e.,

(1)

Max F(x),

subject to: φ_i(x) ≥ 0, i = 1, . . ., k.



F and the φ_i's are assumed to be concave and twice continuously differentiable. Concavity is
usually required for theoretical discussions of convergence. Numerical approaches to solving (1) can 
often be successfully applied even if concavity is lacking. This is true for the material discussed herein. 

Gradient Projection, first discussed by J. B. Rosen [2], [3], is a numerical iterative procedure for 
solving (1), (or attempting to solve (1)) by a sequence of one dimensional searches made in the direction 
of the "projected gradient." 

This paper indicates a weakness in Rosen's method, gives some mathematical results relating 
gradient projection to steepest ascent, and describes a small modification, which follows naturally 
from the mathematical results and overcomes the indicated weakness. 

Section II is devoted to notation used throughout the paper. Section III gives a rough description 
of Rosen's method and a detailed description of those aspects with which this paper is concerned. 
Section IV shows that not all "active" constraints should necessarily be used in projecting the gradient.
Section V reviews some linear algebra which may make the mathematical proofs more lucid. Section 
VI presents two theorems which relate the "suitably projected" gradient direction to the direction of 



*This research was supported in part by the Atomic Energy Commission under Research Contract No. AT(11-1)-1493, and
by the Department of Defense under THEMIS Grant No. F44620-69-C-0116.





steepest ascent. Section VII presents the suggested modification to Rosen's method. Section VIII re- 
lates this material to the methods of feasible directions discussed by Zoutendijk. Concluding remarks 
are given in Section IX. 

II. NOTATION 

Let F and φ_i, i = 1, . . ., k, be as described in Section I. Then ∇F(x), ∇φ_i(x) denote the respec-
tive gradients at the point x = (x_1, . . ., x_n).

If I_q = {i_1, . . ., i_q} is a set of indices taken from 1, . . ., k, then P_q(x) denotes the projection
operation which maps any vector into its projection onto the subspace perpendicular to the subspace
spanned by the ∇φ_i(x), i ∈ I_q. For instance, P_q(x)∇F(x) is the projection of ∇F(x) onto the subspace
perpendicular to the subspace spanned by the ∇φ_i(x), i ∈ I_q. A unit vector in the direction of P_q(x)∇F(x)
is denoted by d_q(x). Finally, in cases where the argument x is obvious, the "(x)" is omitted. Thus,
∇F, ∇φ_i, P_q, d_q are used in place of ∇F(x), ∇φ_i(x), P_q(x), d_q(x).
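The operator P_q can be realized with the standard formula P_q = I − N_q′(N_q N_q′)⁻¹N_q, where the rows of N_q are the gradients ∇φ_i, i ∈ I_q (this matrix N_q appears in Section III below). A minimal numpy sketch with invented gradients:

```python
import numpy as np

def projection(N):
    """Project onto the orthogonal complement of the row space of N."""
    n = N.shape[1]
    return np.eye(n) - N.T @ np.linalg.inv(N @ N.T) @ N

# Gradients of two active constraints in E^3 (invented, independent):
N_q = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 1.0]])
grad_F = np.array([2.0, 1.0, -1.0])

P_q = projection(N_q)
p = P_q @ grad_F             # P_q(x) grad F(x)

# The projection is orthogonal to every active constraint gradient:
assert np.allclose(N_q @ p, 0.0)
d_q = p / np.linalg.norm(p)  # the unit vector d_q(x)
```

The formula assumes the rows of N_q are linearly independent, which is exactly the assumption the paper makes in Section III.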

III. ROSEN'S GRADIENT PROJECTION METHOD 

Rosen's method of choosing a search direction is described below. It is assumed that the feasible 
point x is given (this point is usually the result of a one-dimensional search starting from some other 
point). We are only concerned with how a direction is chosen for the next one-dimensional search. Let 
I_q = {i_1, . . ., i_q} be the set of indices i for which |φ_i(x)| ≤ δ_i.

The δ_i's are preassigned small numbers. It is assumed that the ∇φ_i(x), i ∈ I_q, are linearly inde-
pendent vectors. Rosen's Gradient Projection method chooses the search direction as follows:

We have P_q(x)∇F(x) = ∇F(x) − Σ_{i∈I_q} r_i∇φ_i(x).

(Note: The vector r with components r_i, i ∈ I_q, is given by r = (N_q N_q′)⁻¹ N_q ∇F(x), where N_q is the
matrix whose rows are the vectors ∇φ_i(x), i ∈ I_q.)

1. If ||P_q(x)∇F(x)|| is "very small" (described in detail in Rosen's papers) and r_i ≤ 0, i ∈ I_q, then
do not search any more.

2. If ||P_q(x)∇F(x)|| is not "small" (described in detail in Rosen's papers), then use the direction
of P_q(x)∇F(x) as the search direction.

3. If ||P_q(x)∇F(x)|| is "small" (described in detail in Rosen's papers) and some r_i > 0, i ∈ I_q, then
drop the index j from the set I_q to obtain a new set I_{q−1}. Use the direction of P_{q−1}(x)∇F(x) as the
search direction. The index j is determined by

r_j / ||∇φ_j(x)|| = max_{i∈I_q} { r_i / ||∇φ_i(x)|| }.

The details of what is meant by "very small" and "small" are not included here because the 
discussion in this paper is independent of these details. This paper is concerned with steps 2 and 3. 
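The three cases above can be put into code directly. The sketch below (numpy; the tolerance, the test point, and the constraint gradients are all invented for illustration) computes r = (N_q N_q′)⁻¹ N_q ∇F and then applies steps 1–3:

```python
import numpy as np

def rosen_direction(grad_F, N, small=1e-6):
    """Choose a search direction by Rosen's steps 1-3.

    N holds one active constraint gradient per row.  Returns None when
    step 1 says to stop; otherwise returns a unit search direction.
    """
    r = np.linalg.solve(N @ N.T, N @ grad_F)
    p = grad_F - N.T @ r                      # P_q grad F
    if np.linalg.norm(p) > small:             # step 2: projection not small
        return p / np.linalg.norm(p)
    if np.all(r <= 0):                        # step 1: stop searching
        return None
    # step 3: drop the index j maximizing r_i / ||grad phi_i||
    j = int(np.argmax(r / np.linalg.norm(N, axis=1)))
    N1 = np.delete(N, j, axis=0)
    if len(N1) == 0:
        return grad_F / np.linalg.norm(grad_F)
    r1 = np.linalg.solve(N1 @ N1.T, N1 @ grad_F)
    p1 = grad_F - N1.T @ r1
    return p1 / np.linalg.norm(p1)

# One active constraint whose gradient is orthogonal-ish to grad F:
N = np.array([[1.0, 0.0]])
d = rosen_direction(np.array([-3.0, 4.0]), N)
assert np.allclose(d, [0.0, 1.0])
```

Here the projected gradient is not small, so step 2 fires and the routine returns its unit direction.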

Steps 2 and 3 say that j is dropped from the set of indices I_q only if ||P_q(x)∇F(x)|| is small.

Actually, as will be shown below, a better search direction (i.e., one which yields a greater rate
of increase of F) can be obtained by dropping j whenever

max_{i∈I_q} { r_i / ||∇φ_i(x)|| }

is positive.






IV. ACTIVE CONSTRAINTS 

Fig. 1 illustrates a situation in which the projected gradient direction, while acceptable, is certainly 
not the best choice of search direction. Clearly, a greater rate of increase in F is obtained by searching 
in the gradient direction. In such a situation, Rosen's method would choose the poorer search direction. 



Figure 1. A projected gradient.

Fig. 2 illustrates another situation in which Rosen's method would make a poor choice of search
direction. The vector P_2(x)∇F(x), the projection of ∇F(x) onto the intersection of the planes
x_1 − 3x_2 + x_3 = 18 and 3x_1 − x_2 + x_3 = 20, would be chosen as a search direction by Rosen's method.
Clearly, P_1(x)∇F(x), the projection of ∇F(x) onto the plane 3x_1 − x_2 + x_3 = 20, gives an acceptable
search direction which yields a greater rate of increase in F.



Figure 2. Alternate projections of gradient.

The situation in Fig. 1 may seem unreasonable because one would not expect some previous one-
dimensional search to stop at x. However, the situation in Fig. 2 is reasonable. The point x could very
well be the result of a previous one-dimensional search along x_1 − 3x_2 + x_3 = 18. Such a one-dimensional
search would stop at x because further searching would violate the constraint 3x_1 − x_2 + x_3 ≤ 20.




Let us designate a constraint φ_i ≥ 0 as "active" at x if |φ_i(x)| ≤ δ_i, where the δ_i's are preas-
signed small numbers. In other words, the ith constraint is considered active at x if the hypersurface
φ_i = 0 appears to be close to x. (A better estimator of closeness would be |φ_i(x)|/||∇φ_i(x)||, but this is not
the primary concern of our discussion.) Rosen's method chooses the next search direction by taking
the projection of ∇F(x) onto the intersection of the hyperplanes which are tangent to the surfaces
φ_i = φ_i(x), for all active constraints (except when this projection is "small"). The examples in
Figures 1 and 2 indicate that sometimes it is better to take the projection of ∇F(x) onto an intersection
of a subset of these tangent hyperplanes. The remainder of this paper describes how to recognize this
situation and what to do about it.

V. PROJECTIONS 

Let us recall a few facts from linear algebra, which are used in the proofs of Theorems 1 and 2.
Let W_1 ⊃ W_2 be linear subspaces of Eⁿ. Let P_1 and P_2 be the projection mappings associated with
W_1 and W_2 respectively. For any vector A,

(1) P_iA · P_iA = P_iA · A ≤ A · A, i = 1, 2, with < holding whenever A ≠ P_iA.

(2) P_2P_1A = P_2A.

(3) For any vector B ∈ W_1,

A · P_1A / ||P_1A|| ≥ A · B / ||B||, with = holding only when B = aP_1A, a real and positive.

These facts follow readily from the fact that A − P_iA is perpendicular to W_i, i = 1, 2.
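These facts are easy to spot-check numerically. The sketch below builds nested subspaces W_2 ⊂ W_1 of E³ (the spanning vectors and the test vector A are invented) and verifies facts (1) and (2):

```python
import numpy as np

def proj(basis_rows):
    """Orthogonal projection onto the span of the given rows."""
    M = np.asarray(basis_rows, dtype=float)
    return M.T @ np.linalg.inv(M @ M.T) @ M

# W1 = span{e1, e2} contains W2 = span{e1}:
P1 = proj([[1, 0, 0], [0, 1, 0]])
P2 = proj([[1, 0, 0]])

A = np.array([3.0, -2.0, 5.0])

# Fact (1): P_i A . P_i A = P_i A . A <= A . A, i = 1, 2
assert abs((P1 @ A) @ (P1 @ A) - (P1 @ A) @ A) < 1e-12
assert (P1 @ A) @ (P1 @ A) <= A @ A
assert (P2 @ A) @ (P2 @ A) <= A @ A

# Fact (2): P2 P1 A = P2 A
assert np.allclose(P2 @ (P1 @ A), P2 @ A)
```

Both facts reduce to the observation stated in the text: the residual A − P_iA is perpendicular to W_i.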

VI. STEEPEST ASCENT VS PROJECTED GRADIENT 

The steepest ascent direction at a point x is given by the unit direction vector y which maximizes
the rate of change of F (∇F(x) · y) and does not violate the hyperplanes which are tangent to the
active constraint surfaces (i.e., ∇φ_i(x) · y ≥ 0 for active φ_i's). Thus, the steepest ascent direction is
given by that vector d which solves the following nonlinear programming problem:

(2)  Max ∇F(x) · d

     subject to: ∇φ_i(x) · d ≥ 0, for φ_i active, i.e., i ∈ I,
                 d · d = 1.

This direction is the locally best search direction in the case of linear constraints. (Of course, any
one-dimensional search in this direction must be accompanied by small corrections to insure that the
nonlinear constraints φ_i ≥ 0 are not significantly violated during the search.)
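In two variables, problem (2) can be checked by brute force over unit directions. The sketch below (an invented instance with a single active linear constraint, chosen so that the gradient itself is feasible and hence optimal) compares the best feasible unit direction found on a fine grid of angles with the normalized gradient:

```python
import numpy as np

grad_F = np.array([1.0, 2.0])
grad_phi = np.array([0.0, 1.0])   # single active constraint gradient

# Brute force over unit directions d with grad_phi . d >= 0:
angles = np.linspace(0, 2 * np.pi, 100000, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
feasible = dirs[dirs @ grad_phi >= 0]
best = feasible[np.argmax(feasible @ grad_F)]

# Here grad_F itself satisfies the constraint, so the steepest ascent
# direction is simply grad_F normalized (I_q empty in Theorem 1 below):
expected = grad_F / np.linalg.norm(grad_F)
assert np.allclose(best, expected, atol=1e-3)
```

When the gradient violates a tangent hyperplane, the brute-force winner instead lies on that hyperplane, which is the situation Theorems 1 and 2 characterize.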

Hadley [1, p. 301] mentions that although the solution to (2) can be obtained (by solving a quad-
ratic programming problem), simpler procedures can be used to find a d which satisfies the constraints
in (2) and has ∇F(x) · d > 0. Gradient Projection is one of several methods for finding such a d. The
following theorem indicates that the solution to (2) is given by the projection of the gradient onto a
suitable subspace.

THEOREM 1: The solution to (2) must be a unit vector having the direction of P_q(x)∇F(x) for
some I_q = {i_1, . . ., i_q}.

PROOF: Let d̄ be the solution to (2). Set I_q = {i | ∇φ_i · d̄ = 0}, and d_q = P_q∇F / ||P_q∇F||.


We will show that d̄ = d_q. Suppose not. Then, since d̄ is a unit vector with ∇φ_i · d̄ = 0, i ∈ I_q, we have
by V(3),

(3)  ∇F · d̄ = P_q∇F · d̄ < P_q∇F · d_q = ∇F · d_q.

Now, for any γ > 0,

∇φ_i · (d̄ + γd_q) = 0, i ∈ I_q.

And, since ∇φ_i · d̄ > 0, i ∉ I_q, γ can be chosen small enough so that

∇φ_i · (d̄ + γd_q) > 0, i ∉ I_q.

Let γ̄ > 0 be such a choice of γ. Then d̂ = (d̄ + γ̄d_q) / ||d̄ + γ̄d_q|| is a feasible vector for (2).

Moreover,

∇F · d̂ = (∇F · d̄ + γ̄∇F · d_q) / ||d̄ + γ̄d_q|| > (1 + γ̄)∇F · d̄ / (1 + γ̄) = ∇F · d̄.

This last inequality contradicts the fact that d̄ is the optimal solution to (2). Hence, we must have that
d̄ = d_q. Q.E.D.

Theorem 1 implies that (2) can be solved by suitably determining I_q. If we assume that the ∇φ_i(x),
i ∈ I (i.e., the gradients of the active constraints) are linearly independent, then the next theorem char-
acterizes the proper choice for I_q.

THEOREM 2: P_q(x)∇F(x) (= ∇F(x) − Σ_{i∈I_q} r_i∇φ_i(x)) gives the direction of the solution to (2)
if and only if

i) P_q(x)∇F(x) · ∇φ_i(x) ≥ 0, i ∉ I_q, and

ii) r_i ≤ 0, i ∈ I_q.

PROOF: Set d_q = P_q∇F / ||P_q∇F||.

Sufficiency. Let d be any feasible vector for (2). Then, by use of the equation in the statement of the
theorem,

∇F · d = P_q∇F · d + Σ_{i∈I_q} r_i∇φ_i · d
       ≤ P_q∇F · d
       ≤ ||P_q∇F|| = P_q∇F · d_q = ∇F · d_q.

Therefore, d_q maximizes ∇F · d.

Necessity. Condition (i) is necessary because d_q must be a feasible solution to (2).

Suppose (ii) does not hold. Let j ∈ I_q be such that r_j > 0. We will show that there is a feasible d̂ for which

∇F · d_q < ∇F · d̂.

First, let I_q′ = {i ∈ I | ∇φ_i · P_q∇F = 0}.




Notice that P_q′∇F = P_q′P_q∇F = P_q∇F.

The first equality follows from V(2) and the fact that I_q ⊂ I_q′. The second equality follows from
the method of constructing I_q′.

Set I_{q′−1} = {i | i ∈ I_q′ and i ≠ j}. We claim

(a) ∇φ_j · d_{q′−1} ≥ 0 and (b) ∇F · d_{q′−1} > ∇F · d_q′.

To see (a), we observe that

P_q′∇F · P_q′∇F = P_{q′−1}∇F · P_q′∇F
               = P_{q′−1}∇F · P_q∇F
               = P_{q′−1}∇F · ∇F − Σ_{i∈I_q} r_i P_{q′−1}∇F · ∇φ_i
               = P_{q′−1}∇F · ∇F − r_j P_{q′−1}∇F · ∇φ_j
               = P_{q′−1}∇F · P_{q′−1}∇F − r_j P_{q′−1}∇F · ∇φ_j,

i.e., ||P_q′∇F||² = ||P_{q′−1}∇F||² − r_j P_{q′−1}∇F · ∇φ_j,

i.e., P_{q′−1}∇F · ∇φ_j = (1/r_j)(||P_{q′−1}∇F||² − ||P_q′∇F||²) ≥ 0, where this last inequality follows from the
fact that r_j > 0 and I_{q′−1} ⊂ I_q′. Assertion (a) now follows because d_{q′−1} is a unit vector in the direction
of P_{q′−1}∇F.

To see (b), first notice that if P_{q′−1}∇F = P_q′∇F, then we would have

∇F − Σ_{i∈I_{q′−1}} s_i∇φ_i = ∇F − Σ_{i∈I_q′} r_i∇φ_i,

i.e., r_j∇φ_j(x) = Σ_{i≠j} s_i∇φ_i − Σ_{i≠j} r_i∇φ_i, with r_j ≠ 0.

This last statement contradicts the linear independence of the ∇φ_i. Thus, P_{q′−1}∇F ≠ P_q′∇F, and since

P_q′∇F = P_q′P_{q′−1}∇F, by V(2),

we must have ||P_{q′−1}∇F|| > ||P_q′∇F||, by V(1).

But ||P_i∇F|| = P_i∇F · d_i, i = q′, q′−1. So

P_{q′−1}∇F · d_{q′−1} > P_q′∇F · d_q′,

i.e., ∇F · d_{q′−1} > ∇F · d_q′.

But d_q′ = d_q because, as noted above, P_q′∇F = P_q∇F. This gives us (b).




From the definition of I_q′ and I_{q′−1} it follows that ∇φ_i · d_q = 0, i ∈ I_q′, and ∇φ_i · d_{q′−1} = 0, i ∈ I_{q′−1}.
This, combined with assertion (a), says that, for γ > 0,

∇φ_i · (d_q + γd_{q′−1}) ≥ 0, i ∈ I_q′.

Then, since ∇φ_i · d_q > 0, i ∈ I − I_q′, we can choose γ = γ̄ small enough so that ∇φ_i · (d_q + γ̄d_{q′−1}) > 0,
i ∈ I − I_q′. Thus

d̂ = (d_q + γ̄d_{q′−1}) / ||d_q + γ̄d_{q′−1}||

is a feasible vector for (2).

Also, using (b), we have

∇F · d̂ = (∇F · d_q + γ̄∇F · d_{q′−1}) / ||d_q + γ̄d_{q′−1}||
       > (1 + γ̄)∇F · d_q / ||d_q + γ̄d_{q′−1}||
       ≥ ∇F · d_q.

We have a contradiction to d_q being a solution to (2). This completes the proof of the necessity.
Q.E.D.

VII. MODIFIED GRADIENT PROJECTION 

This section describes a simple modification to Rosen's method of gradient projection. This 
modification helps to overcome the deficiencies which were described in Section IV. The modification 
is justified by the following corollary to Theorem 2. 

COROLLARY TO THEOREM 2: If I_q = I and P_q∇F = ∇F − Σ_{i∈I} r_i∇φ_i, where some r_j > 0, then

d_{q−1} (I_{q−1} = I_q − {j}) is a feasible solution to (2) with ∇F · d_{q−1} > ∇F · d_q.
(i.e., P_{q−1}∇F gives a better search direction than P_q∇F.)

PROOF: This result follows from the proof of the necessary condition in Theorem 2. 

The above corollary suggests the following modification to the method of gradient projection. 
Steps 2 and 3 of Section III should be replaced by the following step: 

2′. If some r_i > 0, i ∈ I_q, then drop the index j from the set I_q to obtain I_{q−1}. Use the direction of
P_{q−1}∇F(x) as the search direction. The index j is determined by

r_j / ||∇φ_j(x)|| = max_{i∈I_q} { r_i / ||∇φ_i(x)|| }.
This modification gives an improved search direction in the sense that it gives a direction of
greater rate of increase. In addition, if the recalculated r_i's are all negative, then a steepest ascent
direction (the solution to (2)) is obtained. The termination of a search iteration usually involves the
encountering of no more than one new constraint surface. Thus, permitting one index to be dropped
should usually maintain the r_i's as negative. Of course, this will not always be the case.
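Step 2′ amounts to a small change in the direction routine: test the multipliers first, and only project after dropping the worst index. A sketch (numpy; the constraint data are invented, and the routine drops at most one index per call, as the paragraph above suggests is usually enough):

```python
import numpy as np

def modified_direction(grad_F, N):
    """Step 2' of the modified gradient projection method.

    N holds the active constraint gradients as rows.  If some r_i > 0,
    drop the index j maximizing r_i / ||grad phi_i||, then project.
    """
    r = np.linalg.solve(N @ N.T, N @ grad_F)
    if np.any(r > 0):
        j = int(np.argmax(r / np.linalg.norm(N, axis=1)))
        N = np.delete(N, j, axis=0)
        if N.shape[0] == 0:
            return grad_F / np.linalg.norm(grad_F)
        r = np.linalg.solve(N @ N.T, N @ grad_F)
    p = grad_F - N.T @ r
    return p / np.linalg.norm(p)

# The Figure 1 situation: one active constraint whose multiplier is
# positive, so the gradient itself is the better direction.
N = np.array([[0.0, 1.0]])
grad_F = np.array([1.0, 2.0])
d = modified_direction(grad_F, N)
assert np.allclose(d, grad_F / np.sqrt(5.0))
```

In the example the unmodified method would search along the constraint plane; the modified routine drops the index with positive multiplier and recovers the gradient direction, which has the greater rate of increase.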




VIII. Zoutendijk's Discussion 

G. Zoutendijk [4] discusses several algorithms for handling nonlinear programming problems. 
The algorithms are called "Methods of Feasible Directions." These algorithms fall into the general 
framework of a sequence of one-dimensional searches. They differ primarily in the method of choosing
the search direction. The choice of search direction involves the use of a normalization. Different 
normalizations yield different search directions and, consequently, different algorithms. 

Zoutendijk's normalization N1 [4, p. 70], together with a choice of θ_i = 0, i ∈ I_c(x_k), for determining
S′(x_k) [4, p. 68, Eq. 7.3.6], yields a search direction which corresponds to the direction of steepest
ascent. Of course, choosing θ_i = 0 does not fit in with Zoutendijk's methods because he chooses to
avoid the necessity of calculating correction steps during the one-dimensional search. Nevertheless,
Zoutendijk's discussion of normalization Nl is closely related to the results of this paper. 

In his discussion of normalization N1 [4, section 8.2, p. 80], Zoutendijk mentions that the re-
sulting direction finding problem is of the same type as (2) above [4, p. 81, Eq. 8.2.3]. Thus, his results 
and comments are applicable to the solution of (2). He discusses four methods of solving this problem. 
Two of the methods are finite and are based on the work of Charnes and Dantzig, and Orden and 
Wolfe. The other two methods devised by Zoutendijk are not finite, but are expected to be usually 
more efficient. This serves to illustrate that there is evidently no easy way to solve (2). Thus, the small 
modification to gradient projection proposed in section VII of this paper can be considered as a compro- 
mise between either solving (2) exactly or accepting the first gradient projection as proposed by Rosen. 

IX. CONCLUSION 

The direction of steepest ascent in constrained problems has been shown to be the direction given 
by the projection of the gradient onto a suitably chosen subspace. Moreover, the suitable choice is 
recognized by the negativity of certain coefficients which arise during the calculation of the projected 
gradient. 

The above results give rise to a slight modification to Rosen's gradient projection method. The 
modified projection method usually yields a search direction which is the direction of "steepest ascent." 

When employed, the modified method has the following favorable qualities: 

1) The new projected gradient direction is better (has a greater rate of increase of the objective 
function) than the original projected gradient direction. 

2) The number of constraints being followed is reduced. Thus calculations of correction steps and 
future projections are simplified because the matrix N_q^T N_q (see [3]) is smaller. 

REFERENCES 

[1] Hadley, G., Nonlinear and Dynamic Programming (Addison-Wesley, Reading, Mass., 1964).

[2] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part I," SIAM Journal 8, 181-217 (1960).

[3] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part II," SIAM Journal 9, 414-432 (1961).

[4] Zoutendijk, G., Methods of Feasible Directions (Elsevier Publishing Company, Amsterdam, 1960).



SIMPLE STOCHASTIC NETWORKS: SOME PROBLEMS AND PROCEDURES*

J. M. Burt, Jr.

University of California
Los Angeles, California

D. P. Gaver†

University of California
Berkeley, California

and

M. Perlas

Carnegie-Mellon University
Pittsburgh, Pennsylvania



1. INTRODUCTION 

An extensive literature now exists describing and solving various problems related to aspects of 
project graph analysis. Attention is called in particular to the papers of Charnes, Cooper, and Thompson 
[2], Hartley and Wortham [5], Kelley [7], Martin [9], and Jewell [6], although there are many others as 
well. We mention also the systematic book-length treatment of Moder [10]. 

Initially, project graph analysis (also mnemonically called PERT, GERT, CPM, etc.) assumed 
that individual task completion times were fixed and known in advance. The unreality of such an as- 
sumption in many contexts is apparent. Consequently attempts to introduce the probability distribu- 
tions of completion times have been made; noteworthy for such work are [2], [5], and [9] above. Thus, 
for example, one may be interested in the expectation or other summary of the entire project's comple- 
tion time as the latter depends upon the distributions of the individual task times. Alternatively, it 
may be of interest to seek the probability that a particular path will be "critical," i.e., will be the last 
to be completed. Decision problems of one sort or another may be considered; cf. [2] and [6]. 

This paper deals with stochastic network problems of the sort just mentioned. Since most stochastic 
networks of realistic scale are practically impossible to analyze mathematically, we begin by discussing 
simulation, in particular illustrating the gains possible by use of Monte Carlo approaches for improving 
naive "straightforward" simulation techniques. Considerable improvement in simulation efficiency 
can be made if an approximate, analytically tractable, model is available, so we next introduce such 
models and illustrate their use as a "control variate." The analytical network models suggested utilize 
families of task length (link time) distributions based on the simple exponential density, and profit 
by the Markovian or memoryless property associated with the exponential. 



*This research was conducted in part under National Bureau of Standards Contract CST 136 at the Operations Research 
Center, University of California, Berkeley. The work of the authors was also sponsored by the Management Sciences Research 
Group, Carnegie-Mellon University, Pittsburgh, Pennsylvania. 

†Visiting faculty member during the year 1967-1968. 





2. SOME MONTE CARLO SIMULATION TECHNIQUES 

It seems safe to say that only a minority of the actual networks encountered in practice can be 
adequately studied using pencil and paper mathematics. This is not to state that analytical approaches 
are useless: they can, and should, be used to reduce the complexity of parts of the network prior to a 
computational approach, and they can offer plausibility checks. Ideally, one would like to obtain 
enough insight into the behavior of stochastic networks so as to largely eliminate the need for formal, 
perhaps computerized, studies. One way to obtain such insight is to examine various realistic networks 
by means of sampling experiments, using a computer. Since this is also a popular way of solving specific 
problems, a discussion of various techniques is given next. Our intention is to describe certain Monte 
Carlo methods for efficiently dealing with the randomness present. The general purpose is to obtain 
good determinations of, say, project completion time expectations or distributions while keeping 
computational effort low. The methods we discuss were first put to use in physics; see the book by 
Hammersley and Handscomb [4]. 

Figure 1.

A. Antithetic Variables 

To explain this notion, refer first to the network of Figure 1 and suppose we wish to estimate the 
expected completion time of T = T_1 + T_2 by simulation, where T_1 and T_2 are assumed to be independent. 
The discussion which follows is valid for any size network composed of activities whose durations are 
independent. This small example has been chosen solely to simplify the presentation. The straightforward 
procedure is to select two uniform random numbers, R_1 and R_2, transform to realizations of 
T_1 and T_2, perhaps by the familiar procedure, 

T_i = F_i^(-1)(R_i), 

where F_i is the distribution of T_i, and F_i^(-1) its inverse. Then the first realization is 

T^(1) = T_1^(1) + T_2^(1). 

We could tabulate n such realizations using random numbers and average to obtain the estimate 

(2.1) T̄ = [T_1^(1) + T_2^(1) + T_1^(2) + T_2^(2) + . . . + T_1^(n) + T_2^(n)]/n. 

All random numbers are independent, and the n times are generated from their appropriate 
distributions, so 

(2.2) E[T̄] = E[T_1] + E[T_2] 

and 

(2.3) Var[T̄] = (Var[T_1] + Var[T_2])/n. 




In other words, the simple procedure described gives an unbiassed estimator of E[T] whose variance 
decreases as 1/n. If the procedure described is repeated many times (n → ∞), then T̄ becomes 
arbitrarily close to E[T]. Unfortunately, if there are many independent serial links then the sum of the 
variances in the numerator of (2.3) becomes large, and many repetitions are required in order to determine 
E[T] accurately. An attempt to avoid this is in order. 

Notice first that in order to estimate E[T] the realizations T_1^(j) + T_2^(j) and T_1^(j)' + T_2^(j)' need not be 
independent so long as they have the correct marginal distributions. Intuitively, too, one sees that if, 
when T_1 is "large" in one realization, it is forced to be correspondingly "small" in another, then the 
average will tend to be closer to the true value E[T] than in the case of purely independent samples. 
To accomplish this, we can construct two realizations of T = T_1 + T_2 using the same two random numbers: 
first obtain R_1 and R_2, thence T_1 and T_2, and finally T; next subtract R_1 and R_2 from 1 to obtain 
1 − R_i = R'_i (i = 1, 2)*, and from these T'_1 and T'_2, finally adding to get T'. T'_1 and T'_2 are the antithetic 
variates of T_1 and T_2. Lastly, average: 

(2.4) T̄_A = [T^(1) + T^(1)' + T^(2) + T^(2)' + . . . + T^(n) + T^(n)']/(2n) 

          = (1/2){[T^(1) + . . . + T^(n)]/n + [T^(1)' + . . . + T^(n)']/n} 

          = (1/2)(T̄ + T̄'). 

Now by construction, E[T̄] = E[T̄'] = E[T], so the estimate T̄_A is unbiassed. Furthermore, 

(2.5) Var[T̄] = Var[T̄'] = (Var[T_1] + Var[T_2])/n. 

However, it should be intuitively apparent that T̄ and T̄' tend to be negatively correlated, i.e., 
cov[T̄, T̄'] < 0. (This property may be proven easily.) Since 

(2.6) Var[T̄_A] = (1/4)(Var[T̄] + Var[T̄']) + (1/2) cov[T̄, T̄'] = (1/2) Var[T̄] + (1/2) cov[T̄, T̄'] < (1/2) Var[T̄], 

this means that the above simple procedure is more efficient than doubling the total number of independent 
realizations computed. One can, of course, estimate the variance of T̄_A by simply computing 
the sample variance of the n independent averages 

[T^(i) + T^(i)']/2,  (i = 1, 2, . . ., n). 

Since T̄_A is the average of n independent terms, approximate confidence limits may be placed on 
E[T] using the Student t tables. 
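The antithetic pairing just described can be sketched in code. The sketch below is ours, not the authors'; it uses exponential link times of mean 10 (as in the paper's later examples), so E[T] = 20 for the two-link series network of Figure 1, and the function names are hypothetical.

```python
import math
import random

def inv_exp(r, mean=10.0):
    """Inverse distribution function F^{-1}(r) for an exponential of given mean."""
    return -mean * math.log(1.0 - r)

def antithetic_estimate(n, mean=10.0, seed=0):
    """Estimate E[T] for T = T1 + T2 from n antithetic pairs (T + T')/2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r1, r2 = rng.random(), rng.random()
        t = inv_exp(r1, mean) + inv_exp(r2, mean)                 # realization T
        t_prime = inv_exp(1 - r1, mean) + inv_exp(1 - r2, mean)   # antithetic T'
        total += (t + t_prime) / 2.0
    return total / n

print(antithetic_estimate(400))  # close to E[T] = 20
```

Because each pair reuses the same uniforms with 1 − R substituted, cov[T̄, T̄'] < 0 and the pairing does better than simply doubling the number of independent realizations.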



*The use of a few special random variate generators, such as the Normal variate generator using trigonometric functions, 
should be avoided when the antithetic procedure is applied. 




The dramatic effect of antitheticizing a sum is most apparent when T_i is symmetric, e.g., if T_i 
can be assumed uniform, or normal. In the latter case we can write 

(2.7) T_i = m_i + σ_i ξ_i, 

ξ_i being distributed as N(0, 1); m_i and σ_i are the mean and standard deviation of T_i. Then the obvious 
antithetic is obtained by simple sign reversal of ξ_i: 

(2.8) T'_i = m_i − σ_i ξ_i, 

and an average of T_i with T'_i already yields a zero-variance estimate of m_i. 

For the latter example, and for others as well, it should be apparent that the use of antithetic variables 
is helpful in efficiently establishing the appearance of a distribution function. One first determines 
an empirical distribution, F(t), from the T^(i)'s, and a second empirical distribution, F'(t), from the 
T^(i)''s. The antithetic variate estimate of the true distribution would then be 

(2.9) F̂(t) = [F(t) + F'(t)]/2 for all t. 

If F(t) is overly skewed (relative to the true distribution) to one side, then F'(t) is likely to be overly 
skewed to the other side, and F̂(t) will be close to the true unknown distribution. It should be noted 
that estimates of Var(T) should not be formed by simply averaging the sample variances of the 
antithetic variates; such an estimate may be quite biased. The authors have had some success in eliminating 
this bias by application of the Tukey-Quenouille jackknife procedure. 
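The Tukey-Quenouille jackknife mentioned above can be sketched generically; this is the standard leave-one-out formulation, not necessarily the authors' exact procedure, and the function names are ours.

```python
import random

def jackknife(estimator, sample):
    """Quenouille-Tukey jackknife: bias-reduced version of estimator(sample)."""
    n = len(sample)
    theta_full = estimator(sample)
    # Leave-one-out estimates, one per omitted observation
    loo = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    theta_loo = sum(loo) / n
    # The combination n*theta - (n-1)*mean(loo) removes the O(1/n) bias term
    return n * theta_full - (n - 1) * theta_loo

def biased_var(xs):
    """Sample variance with divisor n (biased downward by a factor (n-1)/n)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(1)
data = [random.random() for _ in range(200)]
print(biased_var(data), jackknife(biased_var, data))
```

Applied to the n-divisor sample variance, the jackknife reproduces the usual unbiased (n − 1)-divisor estimate exactly, which illustrates why it can remove the bias of the averaged antithetic variance estimates.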




Figure 2. 

As a second example of the use of the antithetic technique consider a simple parallel network, 
e.g., Figure 2. The goal is to determine the probability with which M = max(T_1, T_2, . . ., T_k) is not 
greater than some number x. For simplicity, let all T_i's have the same rectangular distribution: T_i is 
uniform on [0, 1]. This is no restriction in fact, for, even if the T_i's were arbitrarily but identically 
distributed, M ≤ x if and only if R_i ≤ y for all i, where R_i = F(T_i) and y = F(x). 

To determine the above probability in a straightforward manner we can sample independently 
for each component link. Then tabulate δ, where 

δ = 1 if R_1 < x, R_2 < x, . . ., R_k < x, 
  = 0 otherwise. 




Letting δ^(j) refer to the jth realization (j = 1, 2, . . ., n), the estimate of the desired probability is 

(2.10) δ̄ = [δ^(1) + δ^(2) + . . . + δ^(n)]/n. 

It is simple to see that 

(2.11) E[δ̄] = x^k 

and 

(2.12) Var[δ̄] = x^k(1 − x^k)/n. 

The ratio 

(2.13) Var[δ̄]/E²[δ̄] = (x^(−k) − 1)/n 



measures the fractional or percent error of the estimate. Plainly a large k tends to make the above 
error measure large, requiring an enormous n to obtain a good determination. 

An antithetic approach might go as follows. With δ associate the antithetic indicator δ', where 

(2.14) δ' = 1 if (1 − R_1) < x, (1 − R_2) < x, . . ., (1 − R_k) < x, 
          = 0 otherwise. 

Obviously δ and δ' cannot be unity simultaneously if x < 1/2; for simplicity x < 1/2 will be assumed. 
Now form the antithetic estimate 

(2.15) δ̄_A = [δ^(1) + δ^(1)' + δ^(2) + δ^(2)' + . . . + δ^(n) + δ^(n)']/(2n). 

Clearly δ + δ' is unity with probability 2x^k. Hence 

(2.16) E[δ̄_A] = 2nx^k/(2n) = x^k, 

so the estimate is unbiassed. Furthermore, 

(2.17) Var[δ̄_A] = x^k(1 − 2x^k)/(2n). 

The proportional error of the estimate is 

(2.18) Var[δ̄_A]/E²[δ̄_A] = (x^(−k) − 2)/(2n) ≈ (1/2) Var[δ̄]/E²[δ̄]. 

Of course the above antithetic procedure is merely one possibility. Another is to antitheticize on 
some, but not all, of the links simultaneously. There are in all 2^k such possibilities. 
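The estimate (2.15) for the parallel network can be sketched directly; the sketch below is ours (function names hypothetical) and assumes uniform link times and x < 1/2, as in the text.

```python
import random

def antithetic_prob_estimate(k, x, n, seed=0):
    """Estimate P{max(T1, ..., Tk) <= x} for i.i.d. uniform(0,1) links by
    pairing each indicator delta with its antithetic delta' (x < 1/2 assumed)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        rs = [rng.random() for _ in range(k)]
        delta = all(r < x for r in rs)            # delta of (2.10)
        delta_p = all((1 - r) < x for r in rs)    # antithetic delta' of (2.14)
        total += delta + delta_p                  # never both 1 when x < 1/2
    return total / (2 * n)

# True value is x**k; e.g. k = 3, x = 0.4 gives 0.064
print(antithetic_prob_estimate(3, 0.4, 5000))
```

Since δ and δ' cannot both equal one, each pair contributes at most a single success, which is what cuts the proportional error in (2.18) roughly in half.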






B. Stratification 

The Monte Carlo technique called stratification is an extension of the concept underlying 
antithetic variates, i.e., the creation of parallel, negatively correlated realizations that are to be 
averaged. To simplify the exposition, let us initially assume our network contains a single activity 
whose time-to-completion is denoted by T_1. As mentioned above, realizations of T_1 are created by taking 
uniform random draws over the unit interval and transforming them, usually via the inverse distribution 
function. "k-way stratification" is performed in the following manner. Divide the unit interval (0, 1) 
underlying realizations of T_1 into k disjoint and exhaustive subintervals; as a practical matter, one 
usually uses subintervals of equal lengths although other choices are available (see [4, p. 56]). Draw a 
uniform random number, R, and compute k parallel realizations of T_1 from the random numbers 
R/k, (1 + R)/k, . . ., (k − 1 + R)/k. Randomly assign these realizations of T_1 to k parallel simulations. 
Repeat the procedure by drawing a second random number, and so on. 

The above procedure leads to k parallel sets of realizations of T_1, and it can easily be shown that 
within any one of these sets, realizations of T_1 are independent. However, between the parallel sets of 
T_1's there may be considerable negative correlation. Thus, if one set has an unusually large number of 
low realizations of T_1 (i.e., lower than E[T_1]), some other set(s) will have a correspondingly small 
number of low realizations of T_1. By taking averages across the sets, low variance estimates of, say, 
E[T_1] may be obtained. 

For networks containing many activities, stratification may be performed on some or on all of the 
activity times, and different k's may be chosen for different activities. There is considerable "art" 
involved in selecting which activity times should be stratified and how detailed (choice of k) such 
stratification should be. Generally, for activity times which have relatively large variance and activities 
which have a high probability of lying on the critical path, one would use fairly detailed stratification 
(about 4- or 5-way). 
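For a single exponentially distributed activity time, the k-way stratification just described might be sketched as follows. This is our own illustration (mean 10 matches the later examples); the random assignment of realizations to parallel simulations is omitted since it does not affect a single-activity mean.

```python
import math
import random

def stratified_mean_estimate(k, n, mean=10.0, seed=0):
    """Estimate E[T1] for an exponential activity time using k-way
    stratification: one uniform draw R yields the k negatively correlated
    numbers R/k, (1+R)/k, ..., (k-1+R)/k, one in each subinterval of (0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r = rng.random()
        for j in range(k):
            u = (j + r) / k                  # lands in the j-th subinterval
            total += -mean * math.log(1.0 - u)  # inverse distribution function
    return total / (n * k)

print(stratified_mean_estimate(3, 1000))  # close to E[T1] = 10
```

Each draw R is forced to cover all k strata, so the k parallel realizations cannot all be low (or all high) at once, which is the source of the negative correlation between the parallel sets.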

C. Control Variates 

Suppose we are interested in estimating the distribution function, or a parameter thereof, of the 
completion time of Network I below. It would be extremely arduous to compute the completion time 
distribution of that network by analytic methods. The control variate procedure requires that we 
construct a simplified network "similar" to Network I, such as Network II (see Figure 3). 





NETWORK I    NETWORK II

Figure 3.



Under certain assumptions (see Section 4) concerning the link time distributions, Network II may 
be analyzed by exact methods. The general idea of the control variates technique is to exploit the 
similarity between I and II in order to improve estimates concerning I; in doing so, use is made of the 
exact knowledge of II. In other words, simulation is used to correct a known result for II to bring it 
close to I. Again the key to the correction procedure is to reuse the basic random numbers to create 




parallel realizations. However, this time the attempt is made to achieve a high positive correlation 
between the behaviors of I and II. 

To clarify these ideas, suppose we are interested in estimating the expected completion time, 
E[M_I], of Network I. Suppose further that the distribution functions corresponding to activity times 
in Network I, except that of T_7, are amenable to mathematical computations, e.g., exponentials, uniforms, 
gammas, etc. We might construct the control Network II as follows: let T_A = E[T_1] + T_3; 
T_B = E[T_4] + T_5; T_C = a random variable highly (positively) correlated with T_7, but possessing a simpler 
(easier to manipulate mathematically) distribution function than that of T_7; and let the remaining link 
times be as in Network I. We then run parallel simulations of the time to complete each network using the same set 
of random draws for corresponding link times; i.e., when we draw a random number to compute T_1, 
we use the resulting value in computing T_A. Similarly, we use the same random draw in computing 
T_7 and T_C. This then implies that a positive correlation tends to exist between M_I and M_II. Notice 
that since 

(2.20) E[M_I] = E[M_II] + E[M_I − M_II], 

it is possible to estimate E[M_I] by computing E[M_II] analytically, and then estimating E[M_I − M_II] 
from a sampling experiment. The latter step involves obtaining n realizations M_I^(i) and M_II^(i) (i = 1, 
2, . . ., n) and averaging: 

(2.21) M̄_I − M̄_II = (1/n) Σ_{i=1}^{n} [M_I^(i) − M_II^(i)]. 

Since the link times shared the same random numbers, M_I^(i) and M_II^(i) are positively correlated. Thus 

(2.22) Var[M̄_I − M̄_II] = (1/n){Var[M_I] + Var[M_II] − 2 cov[M_I, M_II]}. 



Hence, it follows that if 

(2.23) Var[M_II] < 2 cov[M_I, M_II], 

a reduction in variance over a straight simulation has been achieved. Of course the degree of the 
reduction depends upon the skill with which the approximating network, II, is selected and analyzed. 
Usually there is some freedom for choice of II and the latter will have many fewer links than I. One 
method for forming the control variate is to use a subnetwork of the original complex network. The 
subnetwork can be formed by deleting those activities that have low probabilities of being on the 
critical path. 
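A minimal sketch of the correction (2.20) in code. The network here is a stand-in of our own, not the paper's Network I: M_I = max(T_1 + T_2, T_3) with exponential links of mean 10, controlled by the series subnetwork M_II = T_1 + T_2, whose mean E[M_II] = 20 is known exactly.

```python
import math
import random

def inv_exp(r, mean=10.0):
    """Inverse distribution function for an exponential of given mean."""
    return -mean * math.log(1.0 - r)

def control_variate_estimate(n, seed=0):
    """Estimate E[M_I] for M_I = max(T1 + T2, T3) using the control
    M_II = T1 + T2.  The same draws feed both networks, so M_I and M_II
    are positively correlated and E[M_I] = E[M_II] + E[M_I - M_II]."""
    rng = random.Random(seed)
    diff_sum = 0.0
    for _ in range(n):
        t1, t2, t3 = (inv_exp(rng.random()) for _ in range(3))
        m_i = max(t1 + t2, t3)   # "full" network
        m_ii = t1 + t2           # control network, analyzed exactly
        diff_sum += m_i - m_ii
    return 20.0 + diff_sum / n   # known E[M_II] plus estimated correction

print(control_variate_estimate(2000))
```

For these link distributions E[M_I] works out to 22.5, so the term being estimated by simulation is the small correction E[M_I − M_II] = 2.5 rather than E[M_I] itself, which is what makes the method effective.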

Certain natural extensions of this method suggest themselves: 

(a) As described, the control variates method applies straightforward Monte Carlo to estimate 
the correction to E[M_II], i.e., the second term of (2.20). Sometimes this estimate can be further improved 
by supplementary use of antithetic or stratification techniques; see Example 4 in section 3. 




(b) There is no reason to restrict attention to one approximating network. Suppose another, III, 
is introduced. Then 

(2.24) E[M_I] = w_1 E[M_II] + w_2 E[M_III] + w_1 E[M_I − M_II] + w_2 E[M_I − M_III], 

where the weights w_1 and w_2 (w_1 + w_2 = 1) may either be chosen equal to one-half or, perhaps better, 
estimated by making use of the regression technique; see Hammersley and Handscomb [4, p. 66]. In 
the latter we seek w_1, w_2 to minimize 

(2.25) Var[w_1 Δ_1 + w_2 Δ_2] = w_1² Var[Δ_1] + 2 w_1 w_2 cov(Δ_1, Δ_2) + w_2² Var[Δ_2], 

where w_1 + w_2 = 1 and Δ_1 = M_I − M_II, Δ_2 = M_I − M_III. Although the latter variances and covariances 
are unknown, they may be estimated from the sequences of individual realizations {M_I^(i) − M_II^(i)} and 
{M_I^(i) − M_III^(i)}. It seems possible that bias introduced by such a procedure for finding weights may be 
effectively reduced by application of the Tukey-Quenouille jackknife procedure. Finally, there is no 
necessity for limiting the analysis to two approximating networks. It should be mentioned that one very 
simple, and traditional, control variate derives from the original PERT procedure of finding that path 
through the network the sum of whose expected link times is longest. Then all other paths are ignored 
and the distribution of network completion time is taken to be normally distributed with mean equal 
to the sum of the means of the activities on the above "critical path," and with variance equal to the 
sum of the variances. MacCrimmon and Ryavec [8] and Van Slyke [11], as well as others, have examined 
the bias of such an approximation; the control and antithetic variables procedures advanced here 
should be capable of considerable improvements over this PERT approximation. 

(c) The various techniques mentioned have emphasized the determination of the mean or expected 
time to complete a PERT network; however, they apply equally well to the estimation of the expectations 
of other functions. One such is 

(2.26) φ(M) = 1 for M ≥ x, 
            = 0 for M < x; 

the expectation of the latter is just the probability that M ≥ x, which may be of use. In the event that x 
is rather large (or small), then application of the method of importance sampling may be of use in 
improving precision. 

The success of the three Monte Carlo techniques discussed above will vary from network to network. 
Indeed, even on the same network, the success of the techniques will vary from experiment to 
experiment due to randomness of the draws generated for an experiment. In the section to follow, 
several examples of these techniques are illustrated numerically so that the reader may get a feel for 
the improvements that are "typical." In the process, control networks involving exponentially distributed 
link times are employed. Methods for handling such networks analytically (by the Markov property 
and Laplace transform) and numerically (by numerical transform inversion) are relegated to a later 
section. 

3. EXAMPLES 

To make a beginning, the following very simple networks, Figure 4, were treated by means of the 
antithetic and stratification procedures. 






EXAMPLE 1: 

NETWORK 1 (WHEATSTONE) 

NETWORK 2 (SERIES) 

NETWORK 3 (SERIES-PARALLEL) 

Figure 4. 
The distributions for link times (T_i) are identical and independent exponentials 

(3.1) P{T_i > x} = e^(−x/10) for x ≥ 0 
                 = 1 for x < 0. 

All of these examples may be analyzed analytically, and the actual expected completion times and 
variances, etc. have been calculated. 

As a check of the methods suggested, antithetic and three-way stratified sampling were applied, 
as in sections 2A and 2B, to estimate the expected completion time. The results concerning variance 
reduction are summarized in Tables 1 and 2. 

Table 1. Antithetic Variates: Variance Reduction (T* = antithetic realization, (T + T')/2)

Network | N   | Var(T)/N | Var(T')/N | Avg. of (3) & (4) | X1     | Var(T*)/N | X2
  (1)   | (2) |   (3)    |    (4)    |        (5)        |  (6)   |    (7)    | (8)
   1    |  50 |   5.341  |   5.986   |       5.664       |  5.774 |   1.849   | 3.12:1
   1    | 100 |   2.830  |   2.650   |       2.740       |  2.887 |   0.852   | 3.39:1
   1    | 200 |   1.526  |   1.443   |       1.485       |  1.443 |   0.428   | 3.37:1
   1    | 400 |   0.760  |   0.691   |       0.726       |  0.722 |   0.227   | 3.18:1
   2    |  50 |  10.110  |  10.724   |      10.417       | 10.000 |   1.814   | 5.51:1
   2    | 100 |   5.133  |   4.781   |       4.957       |  5.000 |   0.905   | 5.53:1
   2    | 200 |   2.375  |   2.531   |       2.453       |  2.500 |   0.444   | 5.63:1
   2    | 400 |   1.233  |   1.205   |       1.219       |  1.250 |   0.225   | 5.56:1
   3    |  50 |  10.486  |   9.537   |      10.011       | 10.000 |   3.644   | 2.74:1
   3    | 100 |   4.685  |   5.553   |       5.119       |  5.000 |   1.855   | 2.70:1
   3    | 200 |   2.601  |   2.388   |       2.495       |  2.500 |   0.829   | 3.02:1
   3    | 400 |   1.319  |   1.243   |       1.281       |  1.250 |   0.422   | 2.96:1

Column X1 is Var(T)/N, computed analytically.

Column X2 is the ratio describing variance reduction: X1 divided by Var(T*)/N.





TABLE 2. Stratification: Variance Reduction (T** = stratified realization, (T + T' + T'')/3)

Network | N   | Var(T)/N | Var(T')/N | Var(T'')/N | Avg. of (3), (4) & (5) | X1     | Var(T**)/N | X2
  (1)   | (2) |   (3)    |    (4)    |    (5)     |          (6)           |  (7)   |    (8)     | (9)
   1    |  50 |   6.368  |   5.869   |   5.761    |         5.999          |  5.774 |   1.171    | 4.93:1
   1    | 100 |   2.846  |   3.091   |   2.939    |         2.959          |  2.887 |   0.631    | 4.58:1
   1    | 200 |   1.464  |   1.595   |   1.355    |         1.471          |  1.443 |   0.319    | 4.52:1
   1    | 400 |   0.758  |   0.704   |   0.693    |         0.718          |  0.722 |   0.153    | 4.72:1
   2    |  50 |   9.565  |   9.831   |  10.235    |         9.887          | 10.000 |   1.946    | 5.14:1
   2    | 100 |   5.090  |   4.713   |   5.064    |         4.956          |  5.000 |   0.934    | 5.35:1
   2    | 200 |   2.479  |   2.411   |   2.457    |         2.449          |  2.500 |   0.454    | 5.51:1
   2    | 400 |   1.235  |   1.243   |   1.203    |         1.227          |  1.250 |   0.229    | 5.46:1
   3    |  50 |   9.942  |  10.737   |  10.001    |        10.226          | 10.000 |   2.422    | 4.13:1
   3    | 100 |   5.037  |   5.569   |   5.561    |         5.389          |  5.000 |   1.258    | 3.98:1
   3    | 200 |   2.754  |   2.432   |   2.542    |         2.576          |  2.500 |   0.592    | 4.22:1
   3    | 400 |   1.378  |   1.225   |   1.272    |         1.292          |  1.250 |   0.311    | 4.02:1

Column X1 is Var(T)/N (= Var(T')/N = Var(T'')/N), computed analytically.

Column X2 is the ratio describing variance reduction: X1 divided by Var(T**)/N.


As can be seen, the variance reduction obtained by utilizing the antithetic or stratification procedure is 
greater than that from doubling the sample size and sampling independently. 

We remark, in passing, that the estimate of completion time variance obtained by simple averaging 
of the variance estimates obtained from the antithetic experiments, i.e., the estimate in Column 5 of 
Table 1, is likely to be biassed. Agreement with the calculated values seems, however, to be quite 
satisfactory for the present example. 

In the second example control variates are applied to Network 1 (Wheatstone) above. 

EXAMPLE 2: Obvious possible control networks to use in connection with Network 1, as shown 
in Figure 5, are: 





CONTROL 1*    CONTROL 2*

Figure 5.

*The random variables, T_i, in the control networks are identical to those in the original Network 1.






The completion time for Network 1 may be written as 

(3.2) T = max (T_1 + T_2, T_1 + T_3 + T_5, T_4 + T_5), 

while those for Controls 1 and 2 are, respectively, 

(3.3) C(1) = T_1 + T_3 + T_5, 
      C(2) = T_1 + max (T_2, T_3 + T_5). 

Again the true values may be computed: 

(3.4) E[T] = 34.58,     Var[T] = 289, 
      E[C(1)] = 30,     Var[C(1)] = 300, 
      E[C(2)] = 32.5,   Var[C(2)] = 294. 

Table 3 illustrates the effect of the technique. In the table, the control estimate is 

(3.5) T̂_C(i) = T̄ + (E[C(i)] − C̄(i)),  i = 1, 2. 

Here the mean-squared error (m.s.e.) reduction is tabulated in addition to the variance reduction; 
the m.s.e. is estimated by averaging the squared difference between the control estimate and E[T]. 



TABLE 3. Effects on Variances

                        Number of realizations, N
                       50       100      200      400
T̄                    37.74    33.84    34.01    34.84
Var(T̄)/N              6.23     2.35     1.47     0.75
C̄(1)                 27.31    29.32    29.13    30.34
T̂_C(1)               35.05    34.02    34.88    34.50
Var(T̂_C(1))/N         1.97     0.72     0.41     0.36
Var. Red.             3.16:1   3.27:1   3.59:1   2.88:1
M.S.E. Red.           3.86:1   4.03:1   2.98:1   3.04:1
C̄(2)                 34.84    31.35    32.04    32.70
T̂_C(2)               35.40    34.99    34.47    34.64
Var(T̂_C(2))/N         0.82     0.46     0.25     0.15
Var. Red.             7.59:1   5.11:1   5.88:1   5.00:1
M.S.E. Red.           5.09:1   4.17:1   4.93:1   4.03:1


The reader will observe that the variance and squared-error improvements obtained with the first 
control variate are roughly the same as those achieved with the antithetic variate and stratification 
procedures; however, the second control variate yielded improvements almost twice as large as those 
of the other Monte Carlo techniques. 




EXAMPLE 3: This example refers to the network depicted in Figure 6. 




Figure 6. 



It will be of interest to apply some of our techniques to this problem, since it exhibits greater complexity 
than the examples previously treated. 

Obviously, complication is introduced by such "crossing" links as link 10, having completion time 
T_10. If all link times have the same mean (assumed here) it seems natural to use the partial network of 
Figure 7 as a control. Of course, other possibilities (e.g., that involving links 1, 5, 10, 7, 8, and 9) are of 
interest as well. It turns out that the control network of Figure 7 can be completely analyzed numerically 
(see the next section) if link times are independently and exponentially distributed. Supposing that 
E[T_i] = 10, we then find that 



(3.6) E[T_AB^(C)] = 47.5,  Var[T_AB^(C)] = 418.8, 

where T_AB^(C) denotes the time to complete the control network of Figure 7. 




Figure 7. 






Table 4 summarizes our sampling results when the number of repetitions is 25. 

SAMPLING RESULTS
TABLE 4. Number of Repetitions: N = 25

                               Straight | Antithetic | Antithetic estimate | Control estimate | Control-regression
Means
  Full Network (T_AB):           58.09  |   50.95    |        54.5         |       57.4       |       57.5
  Control Network (T_AB^(C)):    48.2   |   42.7     |        45.5         |                  |
Variances
  Full Network:                  790    |   373      |        208          |       314        |       298
  Control Network:               689    |   267      |        111          |                  |

The antithetic and control procedures adopted here reduce the straightforward variance of the estimate 
by one-half to one-third. 

The above experiment was repeated with N = 50 independent repetitions. The relative variance 
reductions were found to be quite comparable to those for N = 25 (see Table 5). 

SAMPLING RESULTS
TABLE 5. Number of Repetitions: N = 50

                               Straight | Antithetic | Antithetic estimate | Control estimate | Control-regression
Means
  Full Network:                  54.6   |   54.2     |        54.4         |       54.2       |       54.3
  Control Network:               47.9   |   46.8     |        47.4         |                  |
Variances
  Full Network:                  328    |   395      |        105          |       110        |       91
  Control Network:               390    |   382      |        116          |                  |

EXAMPLE 4: Again the network considered is that of Figure 6. The experiments carried out are 
related to, but somewhat more extensive than, those of the previous example. We have completed independent 
samples of N = 25 and N = 50 (10 of each). From each sample (25 and 50) we computed the 
various estimates obtained in the last experiment, and estimates of the variances of those estimates. 

Two additions were made to the previous experiment. First, an alternative control network, depicted 
in Figure 8, is utilized for comparison with that of Figure 7. 







Figure 8. 



The control networks of Figures 7 and 8 will be called the Upper Control Network and the Lower 
Control Network, respectively. Second, a combined antithetic-control estimate is added to the list of 
those considered. In brief, the antithetic estimators for T_AB and T_AB^(C) are constructed; these are 

(3.7) T̂_AB = (T_AB + T'_AB)/2, and T̂_AB^(C) = (T_AB^(C) + T_AB^(C)')/2. 

The same random numbers that appear in T_AB^(C) appear also in T_AB, as was true earlier; their antithetic 
versions appear in T'_AB and T_AB^(C)'. An unbiassed control estimate of E[T_AB] may be created using the 
variables (3.7): 

(3.8) Ê[T_AB]_A,C = E[T_AB^(C)] + [T̂_AB − T̂_AB^(C)]. 

(Ê[T_AB]_A,C denotes the combined antithetic-control estimate of E[T_AB].) Finally, a regression adjusted 
estimate of the form 

(3.9) Ê[T_AB]_A,C,R = T̂_AB + β(T̂_AB^(C) − E[T_AB^(C)]) 

is considered, where the constant β is so selected as to approximately minimize the mean-squared error 
of the estimate; see (2.24). (Ê[T_AB]_A,C,R denotes the regression adjusted antithetic-control estimate 
of E[T_AB].) 

The results of the simulation experiments are presented in Tables 6 and 7. 

Examination of the tables reveals the effects of the several variance-reducing maneuvers employed. 
Roughly speaking, a halving of the variance of the estimate results when either antithetic or control 
methods alone are used. Present evidence indicates that the Upper Control Network is the more 
effective as a control variate. If both the antithetic and the control devices are used simultaneously, 
as in (3.8), the variance is halved again. 

A word about the regression-adjusted estimates is in order. It may be shown that the parameter 
β in (3.9) is best taken to be 

β* = −Cov[T̂_AB^(C), T̂_AB] / Var[T̂_AB^(C)]. 
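The regression adjustment with β estimated from the sample can be sketched as follows. The network is a stand-in of our own, not the paper's Figure 6 network: M_I = max(T_1 + T_2, T_3) with exponential links of mean 10, controlled by M_II = T_1 + T_2 with known mean 20. As the text notes, estimating β from the same sample introduces a slight bias.

```python
import math
import random

def inv_exp(r, mean=10.0):
    """Inverse distribution function for an exponential of given mean."""
    return -mean * math.log(1.0 - r)

def regression_adjusted_estimate(n, seed=0):
    """Regression-adjusted control variate in the spirit of (3.9):
    beta is estimated as -cov(M_II, M_I)/var(M_II), the variance-
    minimizing choice, then applied to the known control mean 20."""
    rng = random.Random(seed)
    mi, mii = [], []
    for _ in range(n):
        t1, t2, t3 = (inv_exp(rng.random()) for _ in range(3))
        mii.append(t1 + t2)            # control network realization
        mi.append(max(t1 + t2, t3))    # full network realization
    mbar_i = sum(mi) / n
    mbar_ii = sum(mii) / n
    cov = sum((a - mbar_i) * (b - mbar_ii) for a, b in zip(mi, mii)) / (n - 1)
    var = sum((b - mbar_ii) ** 2 for b in mii) / (n - 1)
    beta = -cov / var
    return mbar_i + beta * (mbar_ii - 20.0)

print(regression_adjusted_estimate(2000))
```

With β near −1 (as it is when the control tracks the full network closely), the adjusted estimate reduces to the plain control estimate; the regression simply lets the data choose how much of the control correction to apply.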






SIMULATION RESULTS
Table 6. Means: N = 25

Runs | Straight | Control | Control-regression | Antithetic | Antithetic-control | Regression-adjusted antithetic-control

Lower Control Network
  1  |   50.1   |  50.5   |       51.2         |   54.5     |       51.4         |   51.4
  2  |   56.9   |  54.6   |       55.6         |   54.5     |       56.4         |   57.1
  3  |   53.9   |  56.0   |       55.4         |   55.0     |       55.6         |   55.5
  4  |   49.6   |  52.7   |       52.5         |   52.3     |       53.8         |   54.0
  5  |   54.3   |  54.3   |       54.3         |   51.2     |       54.1         |   53.6
  6  |   55.1   |  52.3   |       52.3         |   53.4     |       54.3         |   54.0
  7  |   54.2   |  53.5   |       53.5         |   54.6     |       54.7         |   54.7
  8  |   54.1   |  51.8   |       52.1         |   56.8     |       54.1         |   54.5
  9  |   57.6   |  52.9   |       54.8         |   51.8     |       53.0         |   52.9
 10  |   56.8   |  56.1   |       56.4         |   57.3     |       56.6         |   56.7
Average           |  55.1  |  53.5  |  53.8  |  54.1  |  54.4  |  54.4
Var. of estimate  |   6.2  |   3.2  |   3.1  |   4.0  |   2.4  |   2.7
S.D. of estimate  |   2.5  |   1.8  |   1.8  |   2.0  |   1.56 |   1.63

Upper Control Network
  1  |   50.1   |  57.4   |       57.5         |   54.5     |       56.5         |   56.5
  2  |   56.9   |  55.1   |       55.9         |   54.5     |       53.7         |   53.6
  3  |   53.9   |  51.7   |       52.0         |   55.0     |       54.2         |   54.2
  4  |   49.6   |  53.9   |       52.7         |   52.3     |       53.4         |   53.4
  5  |   54.3   |  52.1   |       52.4         |   51.2     |       52.5         |   52.3
  6  |   55.1   |  54.7   |       54.7         |   53.4     |       53.3         |   53.3
  7  |   54.2   |  55.3   |       54.9         |   54.6     |       55.0         |   55.0
  8  |   54.1   |  54.8   |       54.7         |   56.8     |       55.9         |   55.9
  9  |   57.6   |  50.5   |       52.0         |   51.8     |       53.3         |   53.0
 10  |   56.8   |  55.8   |       56.2         |   57.3     |       56.3         |   56.4
Average           |  55.1  |  54.1  |  54.3  |  54.1  |  54.3  |  54.4
Var. of estimate  |   6.2  |   4.4  |   3.8  |   4.0  |   2.0  |   2.3
S.D. of estimate  |   2.5  |   2.1  |   1.9  |   2.0  |   1.4  |   1.5



454 



J. M. BURT, Jr., D. P. GAVER, AND M. PERLAS 



Table 7. Means: N = 50

Lower Control Network

Runs   Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                            regression                control       antithetic-control
  1      54.6       56.0       55.6         54.4         54.1              54.2
  2      55.5       56.5       56.2         57.5         54.9              55.6
  3      54.7       55.2       55.0         55.5         54.7              54.8
  4      56.1       58.6       57.8         55.2         56.8              56.8
  5      51.1       51.2       51.2         51.9         54.2              53.6
  6      50.5       56.7       54.6         53.3         54.9              54.3
  7      58.7       55.4       56.2         55.1         54.6              54.8
  8      52.8       52.3       52.3         56.2         53.4              53.9
  9      58.9       54.2       55.6         57.3         55.6              55.9
 10      54.2       52.5       52.9         52.7         52.9              52.9

Average           54.7       54.9       54.7         54.9         54.6              54.7
Var. of estimate   7.9        5.3        4.1          3.5          1.2               1.22
S.D. of estimate   2.8        2.3        2.0          1.9          1.09              1.10

Upper Control Network

Runs   Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                            regression                control       antithetic-control
  1      54.6       54.2       54.3         54.4         54.6              54.5
  2      55.5       54.8       55.0         57.5         56.1              56.5
  3      54.7       52.7       52.9         55.5         54.6              54.8
  4      56.1       54.4       54.7         55.2         53.8              53.8
  5      51.1       55.9       54.6         51.9         54.4              53.8
  6      50.5       52.3       51.9         53.3         54.4              54.1
  7      58.7       53.1       54.0         55.1         53.6              53.7
  8      52.8       56.1       55.4         56.2         57.2              56.8
  9      58.9       56.7       57.3         57.3         56.3              56.4
 10      54.2       54.1       54.2         52.7         54.8              54.4

Average           54.7       54.4       54.4         54.9         55.0              54.9
Var. of estimate   7.9        2.2        2.1          3.5          1.4               1.5
S.D. of estimate   2.8        1.5        1.4          1.9          1.2               1.2






and it is obvious that the covariance term must be estimated from data; see [4, p. 67]. In the experiments, a β-estimate was formed for each sample of 25 or 50 and used to create the Control-Regression and Regression-Adjusted Antithetic-Control figures.
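The antithetic-control and regression-adjusted estimators of (3.8) and (3.9) are easy to reproduce in miniature. The sketch below, in Python, uses a toy four-link network (two two-link series paths in parallel) with exponential link times, and takes the series network of all four links, whose mean is known, as the control; the network, the rate MU, and the sample size are illustrative choices, not those of the experiments reported in the tables.

```python
import random, statistics
from math import log

random.seed(1)
MU = 1.0                       # common link rate (illustrative)
N = 2000                       # number of antithetic pairs

def expo(u, mu=MU):            # exponential link time by inverse transform
    return -log(1.0 - u) / mu

def network_time(us):          # toy network: two parallel two-link series paths
    return max(expo(us[0]) + expo(us[1]), expo(us[2]) + expo(us[3]))

def control_time(us):          # "upper control network": the four links in series
    return sum(expo(u) for u in us)

E_CONTROL = 4.0 / MU           # known mean of the control network
t_av, c_av = [], []
for _ in range(N):
    us = [random.random() for _ in range(4)]
    anti = [1.0 - u for u in us]                  # antithetic random numbers
    t_av.append(0.5 * (network_time(us) + network_time(anti)))
    c_av.append(0.5 * (control_time(us) + control_time(anti)))

mt, mc = statistics.mean(t_av), statistics.mean(c_av)
est_ac = E_CONTROL + (mt - mc)                    # estimator of eq. (3.8)
cov = sum((t - mt) * (c - mc) for t, c in zip(t_av, c_av)) / (N - 1)
beta = -cov / statistics.variance(c_av)           # sample version of beta*
est_acr = mt + beta * (mc - E_CONTROL)            # estimator of eq. (3.9)
print(round(est_ac, 2), round(est_acr, 2))
```

For this toy network the exact mean of the parallel pair can be worked out to be 2.75/MU, so both estimates should land close to 2.75.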

4. ANALYTICAL METHODS BASED ON EXPONENTIAL DISTRIBUTIONS 

The assumption that exponential distributions govern link times in networks often results in
considerable mathematical simplicity. Of course the link times encountered in real problems are apt
not to be of exponential character. Their densities are usually expected to exhibit a mode (peak) at
some positive time value, and perhaps to have a vaguely normal or Gaussian appearance.
Two remarks may be made: 

i) If empirical observation indeed suggests a non-exponential form of the type described above, 
combinations (e.g., convolutions) of exponentials may be used as approximations. 

ii) Certain types of operations may be expected, theoretically, to give rise to near-exponential 
distributions. These are of a quasi-Sisyphean character: link traversal terminates when a certain task 
is successfully completed, but if success does not occur the task must be repeated; success occurs on 
any attempt with small independent probability, p. Certain computer program debugging, or other 
trouble shooting, processes may be of this nature. Construction completion times might have a similar 
character if weather conditions intervene and necessitate repetitions. In another context, consider the time required to pass through a part of a transportation route on which one may encounter randomly placed congestion points, each of which imposes an independently and identically distributed delay. If the number of congestion points has a geometric distribution with a reasonably large mean, the time required has near-exponential properties. Thus the exponential may be worth consideration, and we shall use it as a basis for discussion here.
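The near-exponential claim for geometrically many congestion delays is easy to check by simulation. The sketch below sums a geometric number of uniform delays (the success probability, the uniform delay law, and the sample size are illustrative) and inspects two exponential signatures: a coefficient of variation near 1 and a tail probability P{T > E[T]} near 1/e ≈ 0.37.

```python
import random, statistics

random.seed(4)
P = 0.02                       # per-point "last one" probability: geometric mean 1/P = 50
N = 20000

def total_delay():
    n = 1                      # geometric number of congestion points
    while random.random() > P:
        n += 1
    return sum(random.random() for _ in range(n))   # i.i.d. uniform(0,1) delays

ts = [total_delay() for _ in range(N)]
m = statistics.mean(ts)
cv = statistics.stdev(ts) / m          # an exponential has coefficient of variation 1
tail = sum(t > m for t in ts) / N      # an exponential has P{T > mean} = 1/e
print(round(cv, 2), round(tail, 2))
```

With a geometric mean of only 50 the agreement is already close; it improves as the mean grows.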

These simple properties of independently and exponentially distributed random variables are
well known or easily derived: (a) sums of identically distributed exponential variables are gamma
distributed; (b) sums of differently distributed exponential variables have distributions expressible as
linear (not necessarily convex) combinations of exponentials; and (c) the distribution function of the
maximum of n independent exponential random variables is the product of the respective distribution
functions. Properties (a) and (b) bear on the modelling of simple series networks, while (c) relates to
links in parallel. Sometimes it is possible to build up complex network distributions from simpler ones as follows.




Figure 9.

The dotted path in Figure 9 represents a complex network with completion time T_U, while T_L is
a single exponential path in parallel. Then

(4.1)    P{T_AB ≤ x} = (1 − e^(−μx)) U(x),

where U is the d.f. of T_U. If we now introduce the Laplace transform E[e^(−sT_AB)] we find that



(4.2)    E[e^(−sT_AB)] = ∫₀^∞ e^(−sx)(1 − e^(−μx)) dU(x) + ∫₀^∞ e^(−sx) U(x) μe^(−μx) dx

                       = Ū(s) − Ū(s+μ) + [μ/(s+μ)] Ū(s+μ),

Ū(s) being the Laplace-Stieltjes transform of U. This procedure can obviously be repeated to generate
further exponential parallel links, with the last transform generated serving as Ū(s) at the next stage.
Consider

SPECIAL CASE A: 



(4.3)    U(x) = 1 − e^(−λx),  x ≥ 0;    U(x) = 0,  x < 0.

Then immediately

(4.4)    E[e^(−sT_AB)] = λ/(λ+s) − λ/(λ+μ+s) + [μ/(s+μ)]·[λ/(λ+μ+s)]

                       = λμ(λ+μ+2s) / [(λ+s)(μ+s)(λ+μ+s)].

If λ = μ the above becomes

    E[e^(−sT_AB)] = [2λ/(2λ+s)]·[λ/(λ+s)].

Thus T_AB is the sum of an exponential with mean (2λ)^(−1) and another with mean λ^(−1); this follows directly
from the Markovian (memoryless) property of the exponential density.
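The λ = μ decomposition can be checked by simulation: the maximum of two independent exponential(λ) variables should match, in distribution, an exponential stage of rate 2λ followed by an independent stage of rate λ. A minimal check of the first two moments (the rate and sample size are illustrative):

```python
import random, statistics
from math import log

random.seed(2)
LAM = 1.0                      # illustrative rate
N = 20000

def expo(rate):
    return -log(1.0 - random.random()) / rate

# maximum of two independent exp(LAM) link times...
maxes = [max(expo(LAM), expo(LAM)) for _ in range(N)]
# ...versus an exp(2*LAM) stage followed by an exp(LAM) stage
sums = [expo(2 * LAM) + expo(LAM) for _ in range(N)]

# both samples should share mean 3/(2*LAM) and variance 5/(4*LAM**2)
print(round(statistics.mean(maxes), 2), round(statistics.mean(sums), 2))
```

Matching the first two moments does not prove equality in distribution, but here the full distributions are equal; higher moments or an empirical-c.d.f. comparison would confirm it.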

A useful generalization of the above situation occurs when T_L has the distribution

(4.5)    P{T_L ≤ x} = 1 − Σ_j c_j e^(−μ_j x).

Then basic transform properties provide that

(4.6)    (1/s) E[e^(−sT_AB)] = ∫₀^∞ e^(−sx)[1 − Σ_j c_j e^(−μ_j x)] U(x) dx = V(s) − Σ_j c_j V(s+μ_j);



here the Laplace transform of the distribution U is

(4.7)    V(s) = ∫₀^∞ e^(−sx) U(x) dx = (1/s) ∫₀^∞ e^(−sx) dU(x) = (1/s) Ū(s).

Consequently,

(4.8)    E[e^(−sT_AB)] = Ū(s) − s Σ_j [c_j/(s+μ_j)] Ū(s+μ_j).





SPECIAL CASE B: Suppose T_L is distributed as a convex combination (probability mixture)
of different exponentials; i.e., with probability c_j the lower link time is exponential with parameter
μ_j. Transform (4.8) then represents this situation. Repetition allows many parallel paths to be added.

SPECIAL CASE C: Now take T_L to be distributed as the sum of two dissimilar exponentials:

(4.9)    E[e^(−sT_L)] = [μ₁/(μ₁+s)]·[μ₂/(μ₂+s)]

                      = [μ₂/(μ₂−μ₁)]·[μ₁/(μ₁+s)] − [μ₁/(μ₂−μ₁)]·[μ₂/(μ₂+s)].

Thus T_L has the distribution (4.5) with c₁ = μ₂(μ₂ − μ₁)^(−1) and c₂ = −μ₁(μ₂ − μ₁)^(−1). More dissimilar
components may be handled by applying a partial fractions decomposition to the transform of T_L.

SPECIAL CASE D: Consider the problem of Special Case C, but put μ₂ = μ₁ + δ, later allowing δ → 0;
T_L thus has a gamma distribution. We find that

(4.10)   E[e^(−sT_AB)] = Ū(s) − [s/(s+μ₁)] Ū(s+μ₁) + s(μ₁/δ)[Ū(s+μ₁+δ)/(s+μ₁+δ) − Ū(s+μ₁)/(s+μ₁)],

and when δ → 0,

(4.11)   E[e^(−sT_AB)] = Ū(s) − [s/(s+μ₁)] Ū(s+μ₁) + sμ₁ (d/dξ)[Ū(ξ)/ξ] evaluated at ξ = s + μ₁.

The formula (4.11) can be derived directly. Suppose T_L has the gamma distribution

(4.12)   P{T_L ≤ x} = 1 − e^(−ax)[1 + ax/1! + ... + (ax)^(k−1)/(k−1)!].

Then by standard transform properties

(4.13)   (1/s) E[e^(−sT_AB)] = ∫₀^∞ e^(−sx){1 − e^(−ax)[1 + ax/1! + ... + (ax)^(k−1)/(k−1)!]} U(x) dx,

or



(4.14)   E[e^(−sT_AB)] = Ū(s) − s Σ_(j=0)^(k−1) [(−a)^j / j!] (d^j/dξ^j)[Ū(ξ)/ξ] evaluated at ξ = s + a;

the zero-order derivative is taken to be the function itself. Clearly (4.11) and (4.14) agree when k = 2
and a = μ₁.
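The k = 2 case of (4.14) can also be confirmed numerically. In the sketch below, T_L is gamma (k = 2, rate A) and the parallel complex path is a single exponential, so Ū(s) = λ/(λ+s); the formula is compared against the closed form obtained by expanding the product distribution P{T_AB ≤ x} and integrating s·e^(−sx) against it term by term. The parameter values are arbitrary illustrations.

```python
LAM, A = 1.0, 1.5              # illustrative rates: exp(LAM) path, gamma(2, A) path

def u_bar(s):                  # Laplace-Stieltjes transform of U(x) = 1 - exp(-LAM*x)
    return LAM / (LAM + s)

def v(xi):                     # V(xi) = u_bar(xi)/xi, as in eq. (4.7)
    return u_bar(xi) / xi

def v_prime(xi):               # hand-computed derivative of LAM/(xi*(LAM+xi))
    return -LAM * (LAM + 2.0 * xi) / (xi * (LAM + xi)) ** 2

def by_4_14(s):                # eq. (4.14) with k = 2, a = A
    return u_bar(s) - s * (v(s + A) - A * v_prime(s + A))

def direct(s):                 # expand [1 - e^{-Ax}(1+Ax)][1 - e^{-LAM x}], integrate
    return s * (1.0/s
                - (1.0/(s + A) + A/(s + A)**2)
                - 1.0/(s + LAM)
                + (1.0/(s + A + LAM) + A/(s + A + LAM)**2))

for s in (0.3, 1.0, 2.7):
    assert abs(by_4_14(s) - direct(s)) < 1e-12
print("(4.14) and direct integration agree")
```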

SPECIAL CASE E: Refer to Example 3 of the previous section; the subnetwork of Figure 7 may
be studied directly by means of (4.14). First, put

(4.15)   Ū(s) = E[e^(−s(T₂+T₃))] = [μ/(μ+s)]².






Then according to (4.14) the transform associated with the completion time T of the parallel network
involving link times T₂, T₃, T₅, and T₆ is

(4.16)   E[e^(−sT)] = [μ/(μ+s)]² − sμ²/[(s+μ)(2μ+s)²] + sμ (d/dξ)[μ²/(ξ(μ+ξ)²)] evaluated at ξ = μ + s

                    = [μ/(μ+s)]² − sμ²/[(μ+s)(2μ+s)²] − sμ³/[(μ+s)²(2μ+s)²] − 2sμ³/[(μ+s)(2μ+s)³].

It is now only necessary to multiply this expression by the transform of the sum of the links involving T₁
and T₄, namely [μ/(μ+s)]², in order to obtain E[e^(−sT_AB)] for the subnetwork of Figure 7. The result may
be simplified to

(4.17)   E[e^(−sT_AB)] = 2[μ/(μ+s)]⁴[μ/(2μ+s)]² + 4[μ/(μ+s)]³[μ/(2μ+s)]³.



Differentiation then produces the mean, variance, and other desired moments. Although it is apparently
possible to invert such transforms analytically, the results will typically appear as incomprehensible
combinations of exponentials. We favor the numerical inversion of such transforms, using perhaps the
method of Abate and Dubner [1]. For given link time distributions one can thus quickly develop a
table of the distribution function of T_AB for the approximating network.
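As a check on the transform written above for (4.17), the sketch below simulates the subnetwork completion time T₁ + T₄ + max(T₂+T₃, T₅+T₆) with all link times exponential(μ) and compares the simulated value of E[e^(−sT_AB)] with the closed form; μ, s, and the sample size are illustrative choices.

```python
import random
from math import exp, log

random.seed(3)
MU = 1.0                       # illustrative common link rate

def expo():
    return -log(1.0 - random.random()) / MU

def t_ab():                    # T1 and T4 in series with max(T2+T3, T5+T6)
    return expo() + expo() + max(expo() + expo(), expo() + expo())

def transform(s):              # the closed-form (4.17)
    a = MU / (MU + s)
    c = MU / (2.0 * MU + s)
    return 2.0 * a**4 * c**2 + 4.0 * a**3 * c**3

N, s = 40000, 0.5
sim = sum(exp(-s * t_ab()) for _ in range(N)) / N
print(round(sim, 3), round(transform(s), 3))
```

A quick sanity check is that the transform equals 1 at s = 0, as any Laplace-Stieltjes transform of a proper distribution must.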



REFERENCES 

[1] Abate, J. and H. Dubner, "Numerical Inversion of Laplace Transforms by Relating Them to the Finite Fourier Transform," J. of the ACM, Vol. 15, No. 1 (Jan. 1968).

[2] Charnes, A., W. W. Cooper, and G. L. Thompson, "Critical Path Analyses Via Chance Constrained and Stochastic Programming," Operations Research, 12, 460-470 (1964).

[3] Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley and Sons, New York, 1966), Vol. II.

[4] Hammersley, J. and D. C. Handscomb, Monte Carlo Methods (Methuen and Co., Ltd., London, 1967).

[5] Hartley, H. O. and A. W. Wortham, "A Statistical Theory for PERT Critical Path Analysis," Management Science, 12, 469-481 (June 1966).

[6] Jewell, W., "Risk Taking in Critical Path Analysis," Management Science, 11, 438-443 (Jan. 1965).

[7] Kelley, J. E., "Critical Path Planning and Scheduling: Mathematical Basis," Operations Research, 9, 296-320 (1961).

[8] MacCrimmon, K. R. and C. A. Ryavec, "An Analytical Study of the PERT Assumptions," Operations Research, 12, 16-37 (1964).

[9] Martin, J. J., "Distribution of the Time Through a Directed Acyclic Network," Operations Research, 13, 46-66 (1965).

[10] Moder, J. and C. Phillips, Project Management with CPM and PERT (Reinhold Pub. Co., New York, 1964).




[11] Van Slyke, R. M., "Monte Carlo Methods and the PERT Problem," Operations Research, 11, 839-860 (1963).

[12] Wiest, J. D., "Some Properties of Schedules for Large Projects with Limited Resources," Operations Research, 12, 395-418 (1964).



A NETWORK ISOLATION ALGORITHM* 

M. Bellmore 

The Johns Hopkins University 

G. Bennington 

North Carolina State University†

and 

S. Lubore 

The MITRE Corporation 

ABSTRACT 

A set of edges D, called an isolation set, is said to isolate a set of nodes R from an undirected network if every chain between the nodes in R contains at least one edge from the set D. Associated with each edge of the network is a positive cost. The isolation problem is concerned with finding an isolation set such that the sum of its edge costs is a minimum.

This paper formulates the problem of determining the minimal cost isolation as a 0-1 
integer linear programming problem. An algorithm is presented which applies a branch and 
bound enumerative scheme to a decomposed linear program whose dual subproblems are 
minimal cost network flow problems. Computational results are given. 

The problem is also formulated as a special quadratic assignment problem and an algo- 
rithm is presented that finds a local optimal solution. This local solution is used for an initial 
bound. 



INTRODUCTION 

Given an undirected connected network G = [N; E] with r sets of distinguished nodes, R_j ⊆ N,
for j = 1, 2, ..., r, a set of edges D ⊆ E, called an isolation set (or isolation), has the property that the
removal of the edges in D partitions the graph into p ≥ r connected components G₁, G₂, ..., G_p.
In addition, for the set D to be an isolation with respect to the r distinguished node sets, the p ≥ r
connected components must be capable of being grouped into exactly r sets, P_j for j = 1, ..., r, such
that each set P_j contains all of the distinguished nodes in R_j. Associated with each edge (x, y) ∈ E of
the graph is a positive cost of removal, c(x, y) > 0. A minimal cost isolation, D₀, is an isolation set
the sum of whose edge costs is a minimum.

One example of an isolation problem is that faced by an antagonist who desires to attack a communications network in a manner which prevents specific sets of nodes R_j from communicating with
other sets of nodes R_i, i ≠ j. If a cost is prescribed for the attack of each link in the network (it is assumed that the cost of attacking nodes is prohibitively high), then the attacker wishes to find a minimal



*This paper was presented at the 35th National ORSA meeting in Denver, Colorado, on June 17, 1969. 
†This work was performed while the author was at The MITRE Corporation.




cost isolation. Problems of a similar type have been investigated and solved by Greenberg [5] and 
Jarvis [6] in a different fashion than the algorithm presented in this paper. 

A variant of the backboard wiring problem [9] can also be formulated as an isolation problem: 
consider an existing system composed of subsystems. New components are to be added to the existing 
system. Each component to be added requires additional wires to be connected to each subsystem 
and to each other component. Let c(x, y) be the number of wires to be added between x and y. Assum- 
ing sufficient space in each subsystem to accommodate any or all of the additional components, then 
one desires to place the new components in the existing subsystems in a manner that minimizes the 
number of additional wires between the subsystems. The set R_j corresponds to the jth subsystem.

Bennington [1] has related the problem of determining the minimal multicommodity disconnecting 
set in an undirected graph to the isolation problem. 

Example: 

Figure 1 shows an undirected graph associated with a simple example of the isolation problem. 



[Figure 1 shows the example graph; the edge costs visible in the figure include c(1,4) = 1, c(3,4) = 1, and c(2,5) = 2.]

Figure 1. Example of the isolation problem



It is desired to isolate nodes 1, 2, and 3 from each other. The number adjacent to each edge in the graph
is the cost to remove the edge. The minimal cost isolation has a value of 3, corresponding to removal
of the edges

{(1,4), (2,5)} or {(1,4), (3,4), (3,5)}.
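For a graph this small, the minimal isolation can be found by brute force. The sketch below assumes a reconstruction of the example in which the only edges are (1,4), (3,4), (3,5), (2,5), and (4,5); the text gives c(1,4) = 1, c(3,4) = 1, and c(2,5) = 2, c(3,5) = 1 is implied by the stated optimal cost, and c(4,5) = 2 is an assumed value, since the figure itself is not recoverable.

```python
from itertools import combinations

# Assumed reconstruction of the example graph (see lead-in for which
# costs come from the text and which are guesses).
COST = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (2, 5): 2, (4, 5): 2}
R = [{1}, {2}, {3}]            # distinguished node sets to isolate
NODES = {1, 2, 3, 4, 5}

def components(edges):
    comp = {x: x for x in NODES}     # union-find over the surviving edges
    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]
            x = comp[x]
        return x
    for x, y in edges:
        comp[find(x)] = find(y)
    return {x: find(x) for x in NODES}

def is_isolation(removed):
    comp = components([e for e in COST if e not in removed])
    reps = [comp[next(iter(r))] for r in R]
    return len(set(reps)) == len(reps)   # distinguished nodes all separated

best = min((sum(COST[e] for e in d), d)
           for k in range(len(COST) + 1)
           for d in combinations(COST, k)
           if is_isolation(set(d)))
print(best[0])                 # minimal isolation cost
```

Under the assumed costs this enumeration reproduces the minimal value 3 quoted in the text.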



THE ISOLATION PROBLEM AS A 0-1 INTEGER LINEAR PROGRAM 

Suppose that an isolation has been found for the network G = [N; E]; then each node x ∈ N is contained in exactly one of the partitions P_j, j = 1, ..., r. In addition each R_j ⊆ P_j.
Let

    π_j(x) = 1, x ∈ P_j;    π_j(x) = 0, x ∈ P_t, t ≠ j.



Each undirected edge of the network is assigned an arbitrary orientation, that is, each edge is 
defined by one of the ordered pairs (x, y) or (y, x) corresponding to the endpoints (nodes) of the 
edge. The isolation problem is stated as: 



Minimize

    Z = Σ_((x,y)∈E) c(x, y) Σ_(j=1)^(r) [γ_j(x, y) + δ_j(x, y)],

subject to

(1)    Σ_(j=1)^(r) π_j(x) = 1,    x ∈ N − ∪_(i=1)^(r) R_i,

(2)    π_j(x) = 1, x ∈ R_j;    π_j(x) = 0, x ∈ R_i, i ≠ j;    j = 1, ..., r,

(3)    π_j(x) − π_j(y) + γ_j(x, y) − δ_j(x, y) = 0,    (x, y) ∈ E,    j = 1, ..., r,

(4)    π_j(x), γ_j(x, y), δ_j(x, y) are 0-1 variables.

Equations (2) fix each of the distinguished nodes x ∈ R_j into a single subset P_j.

If nodes x and y are in the same set, say P_s, corresponding to a feasible isolation, then π_s(x)
= π_s(y) = 1 and, from Eqs. (1) and (2), π_j(x) = π_j(y) = 0, j ≠ s. Further, γ_j(x, y) = δ_j(x, y) = 0, j = 1, ...,
r, is feasible in Eqs. (3) and, since c(x, y) > 0, minimizes Z for this isolation.

If nodes x and y are in different sets, say P_s and P_t, respectively, corresponding to a feasible isolation, then π_s(x) = π_t(y) = 1; and from Eqs. (1) and (2), π_j(x) = 0, j ≠ s, and π_j(y) = 0, j ≠ t. Further, in
order to satisfy Eqs. (3), γ_s(x, y) = 0, δ_s(x, y) = 1; and γ_t(x, y) = 1, δ_t(x, y) = 0. Since c(x, y) > 0,
γ_j(x, y) = δ_j(x, y) = 0, j ≠ s, t.

From the above arguments, an edge (x, y) is contained in an isolation if either γ_j(x, y) = 1 or
δ_j(x, y) = 1 for some index j. Further, since for each δ_s(x, y) = 1 there exists a γ_t(x, y) = 1, and both
contribute to the value of the objective function, (1/2)Z is the value of a minimal cost isolation.

The proposed algorithm solves the isolation problem using the Land and Doig [7] branch and
bound enumerative scheme with restrictions (4) replaced by nonnegativity constraints (i.e., as a series
of linear programs). Each of the linear programs is decomposed using the Dantzig-Wolfe technique
[3]. Equations (1), which link the r subproblems together, correspond to the master problem.

The jth subproblem is:

Maximize    W = Σ_(x∈N) σ(x)π_j(x) − Σ_((x,y)∈E) c(x, y)[γ_j(x, y) + δ_j(x, y)],

subject to:

(2)    π_j(x) = 1, x ∈ R_j;    π_j(x) = 0, x ∈ R_i, i ≠ j,

(2A)   π_j(x) ≤ 1,    x ∈ N − ∪_(i=1)^(r) R_i,

(3)    π_j(x) − π_j(y) + γ_j(x, y) − δ_j(x, y) = 0,    (x, y) ∈ E,

(4A)   π_j(x), γ_j(x, y), δ_j(x, y) ≥ 0,

where σ(x) is the simplex multiplier corresponding to each constraint (1) in the master problem for
the free nodes (i.e., x ∉ R_i, i = 1, 2, ..., r), and σ(x) = 0 for the fixed nodes (i.e., x ∈ ∪_(i=1)^(r) R_i).

Constraints (2A) have been included in order to ensure that the subproblems are bounded. Unless
the subproblem is bounded, no feasible dual solution can be found. Since the solution technique to
be employed requires the dual solution to these subproblems, this restriction is important.

An alternative formulation of the jth subproblem that is easier to deal with is given by (the subscript j has been dropped from all of the variables to alleviate the notational burden):

Minimize    W = − Σ_(x∈N) σ(x)π(x) + Σ_((x,y)∈E) c(x, y)[γ(x, y) + δ(x, y)],

subject to

f(x, y):    π(x) − π(y) + γ(x, y) − δ(x, y) = 0,    (x, y) ∈ E,

p(x):       −π(x) = −1, x ∈ R_j;    −π(x) = 0, x ∈ R_i, i ≠ j;    −π(x) ≥ −1, x ∈ N − ∪ R_i,

            π(x), γ(x, y), δ(x, y) ≥ 0.

The simplex multipliers associated with the dual of this problem are given to the left of their cor- 
responding equations. 

For notational convenience, let f(x, N) = Σ_(y∈N) f(x, y) and let f(N, x) = Σ_(y∈N) f(y, x).

The dual of this problem is

Maximize    T = Σ_(x∈N) b(x)p(x),

where b(x) = 0 if x ∈ R_i, i ≠ j, and b(x) = −1 otherwise, subject to

π(x):       f(x, N) − f(N, x) − p(x) ≤ −σ(x),    x ∈ N,

γ(x, y):    f(x, y) ≤ c(x, y),
δ(x, y):    −f(x, y) ≤ c(x, y),
            f(x, y) unrestricted,    (x, y) ∈ E,

            p(x) ≥ 0 if x ∈ N − ∪_(i=1)^(r) R_i;    p(x) unrestricted if x ∈ ∪_(i=1)^(r) R_i.
As before, the dual variables are given to the left of the constraint equations. This problem resembles
a single commodity flow problem with the exception of the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations
and the p(x) variables. This problem can be transformed into a more recognizable form. Upon adding
the nonnegative slack variables q(x) to the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations, the problem
becomes

(5)    Maximize    T = Σ_(x∈N) b(x)p(x),

subject to

(6)    f(x, N) − f(N, x) − p(x) + q(x) + σ(x) = 0,    x ∈ N,

(7)    −c(x, y) ≤ f(x, y) ≤ c(x, y),    (x, y) ∈ E,

(8)    p(x) ≥ 0 if x ∈ N − ∪_(i=1)^(r) R_i;    p(x) unrestricted if x ∈ ∪_(i=1)^(r) R_i,

(9)    q(x) ≥ 0,    x ∈ N.
Consider the redundant constraint formed by summing the negatives of Eqs. (6):

    − Σ_(x∈N) f(x, N) + Σ_(x∈N) f(N, x) + Σ_(x∈N) p(x) − Σ_(x∈N) q(x) − Σ_(x∈N) σ(x) = 0.

Noting that Σ_(x∈N) f(x, N) = Σ_(x∈N) f(N, x), the redundant constraint can be simplified to

(10)   Σ_(x∈N) [p(x) − q(x) − σ(x)] = 0.

The constraint matrix formed by including this additional constraint has exactly one +1 and one
−1 in each column. Therefore, this problem has a representation as a network flow problem. The
directed network corresponding to the dual linear program of the jth subproblem will be denoted by
G′ = [N′; A′]. The set of nodes N′ is formed from the original set N by adding a dummy node s′. The
set of directed arcs A′ is formed from the original set of edges E by replacing each edge with two directed and oppositely oriented arcs and by adding additional arcs to correspond to the variables q(x),
p(x), and constants σ(x). Since the nonnegative variable q(x) has a coefficient of +1 in Eq. (6) and a
coefficient of −1 in Eq. (10), an arc (x, s′) is added to A′ with an arc flow variable q(x, s′) to replace
q(x). In a similar manner, σ(x) is replaced by a constant arc flow variable σ(x, s′) or σ(s′, x), depending on the sign of the simplex multiplier. The variable p(x) is replaced by an arc flow variable
p(s′, x) if p(x) ≥ 0 and by the arc flow variables p(s′, x) and p(x, s′) if p(x) is unrestricted in sign.
The characteristics of the arcs in A′ are summarized in Table I.

TABLE I. Maximum Cost Flow Problem (Subproblem j)

Arc         Lower bound   Upper bound   Cost   Comments
f(x, y)     -c(x, y)      c(x, y)        0     Replace with two directed arcs
σ(x, s')     σ(x)          σ(x)          0     If σ(x) ≥ 0
σ(s', x)    -σ(x)         -σ(x)          0     If σ(x) < 0
q(x, s')     0             ∞             0
p(s', x)     0             ∞             0     x ∈ R_i, i ≠ j
             0             ∞            -1     otherwise
p(x, s')     0             ∞             1     x ∈ R_j
             0             ∞             0     x ∈ R_i, i ≠ j






Figure 2 contains a portion of a graph that will be used to illustrate the form of the subproblems. 




[Figure 2 shows three nodes 1, 2, 3 with edge costs c(1,2) = 1, c(1,3) = 1, and c(2,3) = 3.]

Figure 2. Subproblem network

Assume that in a subproblem π(1) ≤ 1, π(2) = 0, and π(3) = 1, and that the simplex multipliers are
σ₁ = 1, σ₂ = 2, and σ₃ = −1. The network G′ = [N′; A′] corresponding to this subproblem is shown in
Figure 3. Note that for this particular example some of the parallel arcs can be coalesced with others
or deleted completely.



[Figure 3 residue: visible arc labels include p(s′, 1) with (0, ∞, −1), f(3, 1) with (0, 1, 0), and q(3, s′) with (0, ∞, 0).]


The first term adjacent to each arc denotes the type of flow variable, and the second
term gives the lower bound, upper bound, and cost of the flow on that arc.



Figure 3. Modified graph for subproblem 




Fulkerson's Out-of-Kilter Algorithm [4] may be used to solve the subproblems. Although the
Out-of-Kilter Algorithm is normally used to solve minimal cost flow problems, in order to apply the
Fulkerson Algorithm the flow problem (primal linear program) must maximize the negative of the
costs. Hence, since the dual variables obtained from the Out-of-Kilter Algorithm are required to solve
the jth subproblem, the flow problem has been cast as one of determining a maximal cost flow solution.

In this manner the columns to enter the basis of the master problem are iteratively generated until 
an optimum solution is obtained. 

A SPECIAL QUADRATIC ASSIGNMENT PROBLEM 

The isolation problem can also be formulated as a special case of the quadratic assignment problem. 
(See Lawler [8] for a discussion of the general quadratic assignment problem.) 

Minimize    Z = (1/2) Σ_(j=1)^(r) Σ_(x∈N) Σ_(y∈N) c(x, y) π_j(x)[1 − π_j(y)],

subject to

    Σ_(j=1)^(r) π_j(x) = 1,    x ∈ N − ∪_(i=1)^(r) R_i,

    π_j(x) = 1, x ∈ R_j;    π_j(x) = 0, x ∈ R_i, i ≠ j,

    π_j(x) = 0, 1,    x ∈ N,    j = 1, ..., r.



Since

    Σ_(j=1)^(r) Σ_(x∈N) Σ_(y∈N) c(x, y) π_j(x)[1 − π_j(y)] = Σ_(x∈N) Σ_(y∈N) c(x, y) − Σ_(j=1)^(r) Σ_(x∈N) Σ_(y∈N) c(x, y) π_j(x) π_j(y),

and Σ_(x∈N) Σ_(y∈N) c(x, y) is a constant, the isolation problem can also be stated as follows:

Maximize    Z′ = (1/2) Σ_(j=1)^(r) Σ_(x∈N) Σ_(y∈N) c(x, y) π_j(x) π_j(y),

subject to

    Σ_(j=1)^(r) π_j(x) = 1,    x ∈ N − ∪_(i=1)^(r) R_i,

    π_j(x) = 1, x ∈ R_j;    π_j(x) = 0, x ∈ R_i, i ≠ j,    j = 1, ..., r,

    π_j(x) = 0, 1,    x ∈ N.

AN ALGORITHM TO GENERATE LOCAL OPTIMA 

A modified version of an algorithm of Carlson and Nemhauser [2] is presented which generates a 
local minimum to the isolation problem. No proof of this algorithm is given since it is so similar to that 
of Carlson and Nemhauser. 

STEP 0: Define a feasible partition, P_j, j = 1, ..., r, of the nodes x ∈ N.

STEP 1: Select a free node, x ∉ R_j, j = 1, ..., r, which has not been examined since the last
change in the partition.






STEP 2: Let s denote the index of the set in which x is currently assigned (i.e., x ∈ P_s). Calculate

    d_j(x) = Σ_(y∈N) c(x, y) π_j(y),    j = 1, ..., r.

Note that the movement of node x from set s to set t will result in a change in the isolation cost of
d_s(x) − d_t(x).

STEP 3: Calculate Δ = max_j {d_j(x)}.

STEP 4: If Δ > d_s(x), then there exists a set P_t such that d_s(x) − d_t(x) < 0 and, hence, a solution
with a smaller isolation cost may be found. Move node x into set t and go to Step 1.

If Δ = d_s(x), and a free node has not been examined since the last change in the partition, go to
Step 1.

If Δ = d_s(x), and all free nodes have been examined since the last change in the partition, terminate; a local minimum has been found.
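Steps 0-4 translate almost line for line into code. The sketch below applies them to the small example of Figure 1, using an assumed reconstruction of that graph (in particular c(4,5) = 2 is a guessed cost, since the figure is not recoverable); the starting partition is an arbitrary choice, as Step 0 allows.

```python
# Sketch of the Carlson-Nemhauser-style local improvement (Steps 0-4)
# on an assumed reconstruction of the Figure 1 example graph.
COST = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (2, 5): 2, (4, 5): 2}
R = [{1}, {2}, {3}]
NODES = {1, 2, 3, 4, 5}

def c(x, y):
    return COST.get((x, y), COST.get((y, x), 0))

def local_min(assign):
    # assign[x] = index j of the set P_j currently holding node x
    free = sorted(x for x in NODES if all(x not in r for r in R))
    improved = True
    while improved:
        improved = False
        for x in free:
            s = assign[x]
            # Step 2: d_j(x) = total edge cost from x into set P_j
            d = [sum(c(x, y) for y in NODES if y != x and assign[y] == j)
                 for j in range(len(R))]
            t = max(range(len(R)), key=d.__getitem__)   # Step 3
            if d[t] > d[s]:                             # Step 4: move improves
                assign[x] = t
                improved = True
    return assign

assign = {x: j for j, r in enumerate(R) for x in r}
assign.update({4: 0, 5: 0})    # Step 0: an arbitrary feasible start
assign = local_min(assign)
iso_cost = sum(COST[(x, y)] for (x, y) in COST if assign[x] != assign[y])
print(iso_cost)
```

From this particular start the procedure terminates at a local optimum of isolation cost 4, while the global minimum for the assumed graph is 3, which illustrates why the text uses the local solution only to supply an initial bound.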

COMPUTATIONAL RESULTS 

A computer program for the isolation problem has been written in FORTRAN IV for the IBM 
360/50 computer. The decomposed problems were solved utilizing the revised simplex method with 
product form of the inverse together with the branch and bound algorithm. The subproblems were 
solved utilizing the Out-of-Kilter algorithm as previously described. The computational results for a 
set of representative problems are given in Table II. 

Table II. Computational Results for the Isolation Problem

Problem   No. of   No. of   No. of   No. of edges   IBM 360/50
No.       nodes    edges    sets     in D_0         execution time (sec.)
  1         12       18       3          6                 20.7
  2         15       30       3         12                206.6
  3         15       50       3         13                725.3
  4         15       50       4         17                588.4
  5         15       50       5         21                401.4
  6         15       50       6         24                325.3


The algorithm for generating local minima was also programmed in FORTRAN IV; all problems
solved to date have been completed in less than 15 sec.



ACKNOWLEDGMENT 

The authors wish to thank the MITRE Corporation, under whose auspices this work was performed, 
and Professor George Nemhauser of Cornell University for his many helpful suggestions and comments. 




REFERENCES 

[1] Bennington, G. E., "Multi-Commodity Disconnecting Sets in Undirected Graphs by Node Partition- 
ing," Ph. D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1969). 

[2] Carlson, R. C. and G. L. Nemhauser, "Scheduling to Minimize Interaction Cost," Operations 
Research, 14, 52-58 (1966). 

[3] Dantzig, G. B. and P. Wolfe, "The Decomposition Algorithm for Linear Programs," Econometrica, 
29, 767-778 (1961). 

[4] Fulkerson, D. R., "An Out-of-Kilter Method for Minimal Cost Flow Problems," J. Soc. Ind. and Appl. Math., 9, 18-27 (1961).

[5] Greenberg, Harvey, "Optimal Attack of a Command and Control Network," Ph. D. Thesis, The 
Johns Hopkins University, Baltimore, Maryland (1968). 

[6] Jarvis, John J., "Optimal Attack and Defense of a Command and Control Communications Net- 
work," Ph. D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1968). 

[7] Land, A. H. and A. G. Doig, "An Automatic Method of Solving Discrete Programming Problems," 
Econometrica, 28, 497-520 (1960). 

[8] Lawler, E. L., "The Quadratic Assignment Problem," Management Science, 9, 586-599 (1963). 

[9] Steinberg, L., "The Backboard Wiring Problem: A Placement Algorithm," SIAM Review, 3, 37-50 (1961).



RESOURCE ALLOCATION FOR TRANSPORTATION* 

G. Bennington 

North Carolina State University†

and 

S. Lubore 

The MITRE Corporation 

ABSTRACT 

A linear programming formulation is described that will permit the optimal assignment 
of transportation resources (vehicles) and movement requirements to a network consisting 
of a set of designated origins, ports of embarkation, enroute bases, ports of debarkation, 
and destinations to achieve a prescribed schedule of arrivals. 

INTRODUCTION 

This paper describes the Resource Allocation For Transportation (RAFT) linear programming 
model, which allocates transportation resources and movement requirements to a set of routes for the 
optimal achievement of a prescribed schedule of arrivals. 

Several linear programming solutions of ship routing and scheduling problems have been pre- 
sented. Laderman, Gleiberman, and Egan [3] have investigated two problems of scheduling iron 
ore shipping on the Great Lakes. Fitzpatrick, Bracken, O'Brien, Wentling, and Whiton [2, 7] have 
investigated problems of determining the least cost mix of vehicles (air and sea) to satisfy time phased 
military shipping requirements. 

Rao and Zionts [5] give an algorithm to determine a set of cyclic ship routings for a nonhomogeneous 
fleet of ships that will minimize the total shipping costs. Their approach decomposes the linear program 
into a set of subproblems, each of which is a network flow problem. Dantzig, Blattner, and Rao [1] 
attack this problem in a different fashion in order to determine a set of vehicle routings that have 
minimum cost to time ratio. This determination is made by solving a series of shortest path problems 
in a network that represents all possible cyclic shipping schedules. 

Schwartz [6] presents a linear integer programming formulation of a problem dealing with the 
movement of known cargos from origin to destination on time at minimum fleet cost. Lewis, Rosholdt, 
and Wilkinson [4] give a comprehensive formulation of a problem dealing with strategic mobility, which 
they then treat as a single commodity flow problem through a time expanded network. 



*The work reported in this document was sponsored by the Defense Communications Agency under contract 
F19628-68-C-0365. 

†This work was performed while the author was at the MITRE Corporation.


The RAFT model differs from existing strategic mobility planning LP models in the following 
respects: 

(a) It permits the optimum allocation of both transportation resources and movement requirements 
to available routes from origin to destination. 

(b) It allows for the transshipment of movement requirements subject to constant delays at enroute 
nodes and transshipment points. (Queues are not allowed at enroute nodes or transshipment points.) 

PROBLEM DESCRIPTION 

The problem addressed by this paper is one of moving military forces comprised of personnel and 
materiel from origin to destination over a network consisting of a set of ports (nodes) and vehicle 
oriented routes in order to achieve a desired schedule of arrivals. 

For the purposes of this paper, each item to be moved will be called a movement requirement (or 
requirement). Associated with each movement requirement is a unique origin and destination, together 
with a date when the requirement becomes ready to be moved from its origin and a preferred delivery 
date at its destination. Further, each such requirement is restricted to a given quantity of a particular 
commodity. 

The set of transportation resources is composed of the inventory of military aircraft and ships 
as well as commercially available land, sea and air vehicles that may be chartered for military use. 
The transportation resources are partitioned into sets of constrained and unconstrained vehicles. Time- 
varying vehicle characteristics are considered as well as the availabilities of vehicles as a function of 
time. Additionally, vehicle characteristics may vary for each of the routes in the network. 

For each origin-destination pair in the network, there are defined one or more directed paths 
consisting of sequences of nodes and arcs. The nodes correspond to the physical points in the network 
and the arcs correspond to vehicle movements between points. Each node in the network may be con- 
strained by the maximum vehicle handling throughput capacity for each time period considered in 
the planning period for the movement. Each of these directed paths is designated as a vehicle oriented 
route from an origin to a destination. 

In particular, each route may have any number of enroute stops for vehicles using the route, and 
any number of transshipment points at which the requirements being moved are transferred from one 
vehicle type to another. 

Figure 1 depicts a typical vehicle oriented route. A requirement shipped over this route travels 
from port A (origin) by train to airport B (transshipment point). At airport B, it is transferred to an air- 
craft for shipment by way of airport C (enroute stop) to airport D (transshipment point). At airport D, 
the requirement is transferred to a truck for eventual delivery to area E (destination). At each enroute 
node appropriate transit delay times (loading times, unloading times, and stopover times) are assessed 
in the determination of vehicle use and the times at which the requirement reaches each of the nodes. 

PORT A --train--> AIRPORT B --plane--> AIRPORT C --plane--> AIRPORT D --truck--> AREA E

Figure 1. A typical route



If a particular movement is to be feasible, there must be residual capacity at each node in the 
network at the times corresponding to vehicle departures from these nodes.* Additionally, sufficient 



*Destination nodes are constrained by vehicle arrivals.



RESOURCE ALLOCATION FOR TRANSPORTATION 473 

vehicles must be available at the appropriate times to move the requirements over the several routes. 
No queueing is allowed at intermediate nodes of a route; however, fixed delays may be assessed at 
each node. 

MATHEMATICAL FORMULATION 

NOTATION 

Movement Requirements 

The set I represents the set of movement requirements. For each requirement $i \in I$, we denote its origin by $s_i$ and its destination by $t_i$. Each requirement consists of a nonnegative quantity $F_i$ of personnel or materiel to be moved.

Time Periods 

J will denote the set of time periods allowed for a movement. Specific subsets of J will be defined 
later in the discussion of the constraint equations. 

Nodes 

The set P represents the nodes in the network. Each origin, transshipment point, enroute node, 
and destination is an element of the set P. 

Transportation Resources 

The set K is defined as the set of vehicles to be considered for a particular movement. Each vehicle $k \in K$ is an element of the subset of constrained vehicles $K_1 \subset K$ or the subset of unconstrained vehicles $K_2 \subset K$. It is assumed that the subsets $K_1$ and $K_2$ are mutually exclusive and collectively exhaustive.

Routes 

The set N represents the set of all vehicle oriented routes in the network, and the nth route is represented by the sequence

$$[\,p_1^n, (k_1^n), p_2^n, \ldots, p_q^n, (k_q^n), \ldots, (k_{Q-1}^n), p_Q^n\,],$$

where

$p_q^n \in P$ = the qth node in the nth vehicle oriented route

and

$(k_q^n) \in K$ = the arc representing the type of vehicle moving from $p_q^n$ to $p_{q+1}^n$.

A route segment is a portion of a route

$$[\,p_q^n, (k_q^n), \ldots, (k_{r-1}^n), p_r^n\,]$$

that employs the same vehicle, i.e.,

$$k_q^n = k_{q+1}^n = \cdots = k_{r-1}^n,$$

where

$p_q^n$ and $p_r^n$ = transshipment points, an origin, or a destination.




A route, therefore, consists of a series of route segments corresponding to the sequence of vehicles 
employed on the route. 

For convenience in defining the constraint equations in the later sections, specific subsets of the 
set of routes will be defined. 

The set of routes over which requirement i may move is denoted as

$$N_i^{(1)} \subset N = \{n \in N \mid s_i = p_1^n,\ t_i = p_Q^n\},$$

where

$s_i$ = the origin for requirement i, and

$t_i$ = its destination.

The set of routes incident with node p is denoted as

$$N_p^{(2)} \subset N = \{n \in N \mid p_q^n = p\}.$$

The set of routes over which vehicle k may travel is denoted as

$$N_k^{(3)} \subset N = \{n \in N \mid (k_q^n) = k\}.$$

Decision Variables 

The set X represents the set of decision variables in the linear program. That portion of the ith movement requirement which leaves its origin in the jth time period traveling over the nth route is denoted as $x_{i,j,n}$. A decision variable $x_{i,j,n} \in X$ is defined for each requirement, for each feasible shipping time period, and for each valid route.

CONSTRAINT EQUATIONS 

The set of feasible solutions is defined by the following set of linear inequalities: 

Requirement Constraints 

(1) $$\sum_{x \in X_i^{(1)}} D_{i,i',j,k}\, x_{i',j,n} \ge F_i, \qquad i \in I,$$

where

$X_i^{(1)} = \{x_{i',j,n} \in X \mid n \in N_{i'}^{(1)},\ i' \in I\}$,

$D_{i,i',j,k}$ = the amount of requirement i accompanying a unit of requirement $i'$ on vehicle k if shipped in time period j (note that $D_{i,i,j,k} = 1$), and

$F_i$ = the total amount of requirement i to be moved.

These constraints assure that at least enough of requirement i will be delivered; any overshipment represents residual vehicle capability. If accompanying cargo is not allowed, these constraints are equalities.

Node Constraints 

The nodes in the network may be constrained to reflect their vehicle handling capability and/or 
to reflect the outloading, throughput, or reception capability. 




If $C_{j,p}$ is the maximum number of vehicle departures (vehicle arrivals for a destination), the constraint equation is:

(2) $$\sum_{x \in X_{j,p}^{(2)}} \frac{x_{i,j',n}}{V_{i,j',k,n}} \le C_{j,p}, \qquad j \in J,\ p \in P.$$

The set $X_{j,p}^{(2)}$ is defined as:

$$X_{j,p}^{(2)} = \{x_{i,j',n} \in X \mid n \in N_p^{(2)} \cap N_i^{(1)};\ j' \in J_j^{(2)}\},$$

where

$J_j^{(2)}$ = the set of time periods for each of the shipments $x_{i,j',n}$ which leave (arrive at a destination) node p in time period j, and

$V_{i,j',k,n}$ = the capacity of the kth type vehicle to carry the ith requirement in time period $j'$ over the nth route.

If $C'_{j,p}$ represents a node outloading, throughput, or reception capacity, the constraint is

$$\sum_{x \in X_{j,p}^{(2)}} A_{i,j',n}\, x_{i,j',n} \le C'_{j,p}, \qquad j \in J,\ p \in P,$$

where

$A_{i,j',n}$ = the per unit consumption in the jth time period of node capacity corresponding to shipment $x_{i,j',n}$.

Vehicle Constraints 

(3) $$\sum_{x \in X_{j,k}^{(3)}} \frac{x_{i,j',n}}{B_{i,j',k,n}} \le M_{j,k}, \qquad j \in J,\ k \in K_1.$$

The set $X_{j,k}^{(3)}$ is defined as

$$X_{j,k}^{(3)} = \{x_{i,j',n} \in X \mid n \in N_k^{(3)} \cap N_i^{(1)};\ j' \in J_j^{(3)}\},$$

where

$J_j^{(3)}$ = the set of time periods for each of the shipments $x_{i,j',n}$ which utilize the kth vehicle type in the jth time period,

$B_{i,j',k,n}$ = the total carrying capability of each vehicle of the kth type when employed during the $j'$th time period on the nth route carrying the ith requirement (in the computation of $B_{i,j',k,n}$, the average utilization rate of a vehicle can be taken into consideration), and

$M_{j,k}$ = the number of vehicles of type k available for use in time period j.
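Constraints (1) and (3) can be exercised on a toy instance. The sketch below (all numbers are illustrative assumptions, not data from the paper) ships one requirement of 10 units over two routes served by a single constrained vehicle type in one period, using scipy.optimize.linprog:

```python
# Toy instance of constraints (1) and (3): one requirement, two routes,
# one constrained vehicle type, one shipping period.  All numbers are
# illustrative assumptions, not data from the paper.
from scipy.optimize import linprog

F = 10.0           # F_i: quantity that must be delivered
B = [4.0, 5.0]     # B_{i,j,k,n}: per-vehicle carrying capability on each route
M = 2.2            # M_{j,k}: vehicles of the type available in the period
W = [1.0, 3.0]     # W_{i,j,n}: route 2 arrives later, so it is weighted higher

# x = (x_{i,j,1}, x_{i,j,2}); linprog minimizes W.x subject to
#   (1)  x1 + x2 >= F          (written as -x1 - x2 <= -F)
#   (3)  x1/B1 + x2/B2 <= M
res = linprog(c=W,
              A_ub=[[-1.0, -1.0], [1.0 / B[0], 1.0 / B[1]]],
              b_ub=[-F, M],
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)   # both constraints bind: x = (4, 6), objective 22
```

Both constraints are tight at the optimum: the cheaper route carries as much as the vehicle budget allows, and the remainder moves on route 2.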

OBJECTIVE FUNCTION 

Since the movement requirements have preferred delivery dates, the objective function must 
consider both the allocation of scarce resources and the scheduling of arrivals of the requirements at 
their destinations. 

Hence, in order to deliver the requirements as close as possible to the preferred delivery dates, 
the objective function minimizes the total weighted deviations about the preferred delivery dates. 



The objective function is:

(4) $$\text{minimize} \sum_{x \in X} W_{i,j,n}\, x_{i,j,n},$$

where

$W_{i,j,n}$ = a weighting factor.



Weighting Factors

The weighting factors are functions of the deviation of the arrival time period of the ith requirement at its destination, if shipped in the jth time period over the nth route, from the preferred delivery date of the ith requirement. In general, each $W_{i,j,n}$ may be an arbitrary function of the deviations consistent with the subjective utilities of early or late deliveries as well as any routing or vehicle preference.

Figure 2 illustrates possible utility functions to perform the scheduling. Figure 2(A) depicts a situ- 
ation where early arrivals are always preferred over late arrivals, and the weights vary linearly with 
respect to the deviations. Figures 2(B) and 2(C) correspond to equal trade-offs between early and late 
deviations; Figure 2(B) minimizes the absolute value of the deviations, while Figure 2(C) minimizes 
the square of the deviations. Other weighting factors are easily envisioned and the particular choice 
should be made consistent with the specific movement to be analyzed. 




FIGURE 2. Possible weighting functions to minimize deviations (each panel is plotted about the preferred date)



In the analysis used to date, the weighting function defined in Figure 2(A) has been used. Point B 
is the preferred delivery date for some requirement and is given a weighting factor value of one. Any 



RESOURCE ALLOCATION FOR TRANSPORTATION 



477 



deviation from point B has a weighting factor greater than one. The size of the weighting factor is con- 
sonant with the particular region of the curve within which an arrival occurs. 

Regions of Weighting Factor Curve 

Early Infeasible Region. This region covers the periods of time that the requirement is either not 
ready to be moved or cannot possibly arrive. In a particular movement, arrivals that are more than a 
fixed number of time periods prior to the preferred delivery date could be inadmissible. In this region, 
a weight would not be assigned, rather the decision variable would be suppressed. 

Early Region. This region is represented by the points from A to B, corresponding to the feasible 
early arrival periods. This region is always preferred to an arrival in the late region. The weighting 
factor is assigned a value of one at point B and is increased in steps of unity for each preceding time 
period. For example, if point A corresponds to an arrival two time periods prior to the preferred delivery, the weighting factor assigned to shipments arriving in the time period corresponding to point A has a value of three.

Late Region. This region, represented by the points from C to D, corresponds to a late arrival. 
Point C is assigned a weight that is one greater than any other requirement's maximum early arrival 
weight. The weights assigned to points beyond point C increase in value in unit increments. Hence, 
if the maximum possible early weight for any requirement is 40, the first late weight would be 41. In this
manner, trade-offs between early and late arrivals are always biased towards the early end of the 
scale for the requirements to be moved in a particular problem. 

Late Infeasible Region. This region of the curve corresponds to deliveries that would arrive beyond 
the planning period for the total movement. This region may also correspond to deliveries that are in- 
admissible, since they arrive more than some fixed number of time periods after the preferred delivery 
period. In this region, the decision variables would not be assigned weights, instead, they would be 
suppressed. 
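The four regions can be wrapped in a small helper function. This is a sketch of the scheme just described (the function name and arguments are ours, not the authors'); `None` marks a suppressed decision variable:

```python
def weight(arrival, preferred, earliest, latest, max_early_weight):
    """Weighting factor for an arrival period under the four-region scheme.

    arrival, preferred  -- time periods (integers)
    earliest, latest    -- bounds of the feasible arrival window
    max_early_weight    -- largest early weight over *all* requirements;
                           the first late weight is one greater
    Returns None when the decision variable would be suppressed.
    """
    if arrival < earliest or arrival > latest:
        return None                       # early/late infeasible regions
    if arrival <= preferred:
        return 1 + (preferred - arrival)  # early region: 1, 2, 3, ...
    # late region: starts one above any requirement's maximum early weight
    return max_early_weight + (arrival - preferred)

# Examples from the text: an arrival two periods early gets weight 3;
# if the largest possible early weight is 40, the first late period gets 41.
print(weight(8, 10, 5, 15, 40), weight(11, 10, 5, 15, 40))
```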

MODEL OPTIONS 
Alternative Objective Functions 

Other alternative objective functions may be utilized, depending upon the particular type of 
problem to be analyzed. 

Minimum Time Solution. In some strategic movements, it is desirable to accomplish the overall 
movement as quickly as possible, subject to the constraints described above. Figure 3 depicts a weight- 




FIGURE 3. Weighting function for minimum time




ing function that will minimize the weighted delivery periods of all requirements. If a trade-off exists 
in the same arrival period between one requirement arriving early and another late, preference is given 
to the requirement that would arrive after its preferred delivery period. The discontinuity between 
points B and C is used to resolve such a trade-off. 

Minimum Cost. Rather than minimizing the deviations from a set of preferred delivery periods, it 
may be more important to minimize the total cost of the movement. In this case, the objective function 
becomes: 
(5) $$\text{minimize} \sum_{x \in X} c_{i,j,n}\, x_{i,j,n},$$

where

$c_{i,j,n}$ = the cost of shipping a unit of the ith requirement in the jth time period over the nth route.

Maximum Flow. If the object is to maximize the total flow of commodities over the network subject 
to the vehicle and node capacity constraints, the objective function becomes 

(6) $$\text{maximize} \sum_{x \in X} x_{i,j,n}.$$

Secondary Optimization 

In some instances, it is not only desirable to attain a minimum deviation or minimum time solution, 
but also to achieve this objective at the minimum cost. This can be accomplished by first solving the 
minimum time problem and then annexing the objective function to the constraint matrix to maintain 
an optimal solution to the original problem. The linear cost function is then minimized.* This results 
in the least cost solution that has the minimum total deviation or minimum time. 
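With a general-purpose LP solver, the annexation step can be sketched as follows (toy data, not the authors' matrices; the small tolerance added to the annexed bound keeps the stage-2 problem feasible under floating-point error):

```python
# Two-stage (lexicographic) optimization: minimize deviations first, then
# minimize cost among minimum-deviation solutions.  Data are assumed.
from scipy.optimize import linprog

w = [1.0, 2.0]                      # deviation weights (assumed)
c = [5.0, 3.0]                      # unit shipping costs (assumed)
A_ub, b_ub = [[-1.0, -1.0]], [-10.0]   # delivery: x1 + x2 >= 10
bounds = [(0, None)] * 2

# Stage 1: minimize weighted deviations w.x.
stage1 = linprog(c=w, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
z_star = stage1.fun                 # optimal total deviation

# Stage 2: annex w.x <= z* as a constraint and minimize cost c.x.
stage2 = linprog(c=c,
                 A_ub=A_ub + [w],
                 b_ub=b_ub + [z_star + 1e-9],   # tolerance for round-off
                 bounds=bounds)
print(stage2.x, stage2.fun)
```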

Trade-Off Bias Among Commodities 

The weighting functions considered in the case of minimum deviation or minimum time solutions 
have so far considered a unit of each commodity to be of equal value. Since in many cases, a single 
passenger is not of equal value to a ton of cargo, it is desirable to give preference to a unit of the more 
valuable commodity if competition for scarce resources exists. One way to eliminate this trade-off bias 
is to multiply each weighting function value by an additional factor. Hence, if each passenger delivered on time is considered to be as valuable as 400 pounds of cargo, each weighting factor for the passenger requirement would be reduced to one-fifth of its value. In this manner, every five passengers delivered are equivalent to a single ton of cargo delivered. Similar factors may be introduced for other commodities.

Priorities Among Movement Requirements

Each requirement can be given a priority to favor its shipment over other shipments in a given 
time period if enough resource capacity is not available to ship all competing requirements simultaneously. In general, each requirement will be delivered eventually, so it is only the increments of weights between arrival times that are important and not their absolute values.

Consider the case of two requirements with the weighting functions shown in Figure 4. The higher 
priority requirement corresponds to the line segments A' B and CD'. Because the slopes of A' B and 



*There are several alternative methods of implementing a secondary optimization, but this method has yielded good results 
for the authors. 






CD' are twice that of AB and CD, respectively, the incremental penalty for deviating from the pre- 
ferred delivery period is greater for the higher priority requirement, and, hence, the higher priority 




FIGURE 4. Weighting function with two priority classes 

requirement will always be given preference over the lower priority requirement. In addition, the 
distances from B to C and from B' to C tend to force the late delivery of the lower priority requirement 
before that of the higher priority requirement. 

In the general case of many priority classes, the slopes are increased in proportion to the priorities, 
and the step from B to a delivery one period late is greater than the largest possible weight for the next 
lower priority. In this manner, a number of different priorities may be handled. 

Movement Requirement Mode Preference 

In some movements, a shipping mode preference (air or sea) is given for each movement require- 
ment. Three options are available in the model with respect to modal preference. 

(a) Fixed Mode — The requirement is forced to travel by the preferred mode by suppressing all of 
the decision variables for which the preferred mode of the shipment does not match that of any vehicle 
on the route. 

(b) Preferred Mode — The objective function weights are decreased by a small quantity, say 8, if 
the corresponding Xi,j, „ results in a shipment over a route that uses vehicles consistent with the pre- 
ferred mode of shipment. 

(c) Mode Indifference — The mode preference, if any, is ignored. 



Vehicle Preference by Cargo Type 

In some movements certain vehicles may be preferred for the shipment of one cargo type over 
another. The model specifies an ordered preference of vehicles for each cargo type. In a manner similar
to the treatment of a movement requirement mode preference, the objective function is biased to reflect 
preferences for vehicle usage. 






Table I represents a typical vehicle-cargo preference relationship. In the table, the lower numerical 
values represent the higher priorities. Vehicle 1 is preferred over all other vehicles for the shipment 
of cargo type 1 and vehicle 3 is preferred for cargo type 4. The relative preference for utilization of 
vehicle 2 is the shipment of cargo type 2 first, followed by cargo types 3, 1, and 4. 

Table I. Vehicle-Cargo Preference

                       Cargo Type
                     1     2     3     4
   Vehicle    1      1     6    10    10
    Type      2      5     1     2     6
              3     10     7     2     1

Increased Aircraft Utilization

The productivity of an aircraft type is reflected in the value of $B_{i,j,k,n}$ in the vehicle constraint equations (Eqs. (3)). The value of $B_{i,j,k,n}$ is based upon the long term average utilization rate of an aircraft over a given route. However, for short periods of time, this average utilization rate might be exceeded as long as the long term average value is maintained. For these periods of increased utilization, a higher aircraft productivity $B'_{i,j,k,n}$ may be used.

The constraint equations, Eqs. (3), can be modified for aircraft to allow for increased utilization over some time periods to compensate for less than average utilization in preceding time periods. This is accomplished by defining a variable $T_{j,k}$ corresponding to this under utilization. This variable may then be carried forward into subsequent time periods, where it serves to augment the available vehicle productivity. The modified Eqs. (3) become



(7) $$\sum_{x \in X_{j,k}^{(3)}} \frac{x_{i,j',n}}{B_{i,j',k,n}} + T_{j,k} - T_{j-1,k} \le M_{j,k}, \qquad j \in J,\ k \in K_1,$$

with the understanding that $T_{0,k} = 0$.

Additionally, it is necessary to add a second set of vehicle constraint equations to insure that the 
amount of slack capacity does not exceed the total possible maximum vehicle productivity in a given 
time period. 



(8) $$\sum_{x \in X_{j,k}^{(3)}} \frac{x_{i,j',n}}{B'_{i,j',k,n}} \le M_{j,k}, \qquad j \in J,\ k \in K_1.$$



If it is also desired to restrict greater than average vehicle utilization to a single time period, the 
following constraint equations are used instead of Eqs. (8): 



(9) $$\Omega\left(M_{j,k} - \sum_{x \in X_{j,k}^{(3)}} \frac{x_{i,j',n}}{B'_{i,j',k,n}}\right) - T_{j,k} \ge 0, \qquad j \in J,\ k \in K_1,$$

where

$\Omega$ = a positive constant.

Equation (9) will cause the slack variable $T_{j,k}$ to vanish (due to the nonnegativity restriction on linear programming variables) should the maximum productivity be reached. The term in parentheses acts like Eq. (8), and the value of the constant $\Omega$ determines the maximum value of slack that may be carried forward.

If vehicles are allowed to operate at their maximum utilization rate for two or more successive periods (if sufficient unused slack capacity is available), the variable $T_{j,k}$ in Eq. (9) may be replaced by $T_{j+m-1,k}$, where m is the number of successive periods allowed.
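The carry-forward mechanics of Eqs. (7) and (8) can be traced numerically. In this sketch (the usage figures are assumed), slack banked in early periods lets period 3 exceed the average availability M without violating feasibility:

```python
# Sketch (assumed data): how the slack variable T_{j,k} of Eq. (7) banks
# unused capacity so a later period can exceed the average availability M.
u = [1.0, 1.5, 3.0, 2.0]   # vehicle-equivalents used per period (sum of x/B')
M = 2.0                    # average vehicles available per period (M_{j,k})
M_surge = 3.5              # surge ceiling enforced via Eq. (8)

T, prev, feasible = [], 0.0, True
for u_j in u:
    t_j = prev + M - u_j   # Eq. (7) at equality: slack carried forward
    feasible = feasible and t_j >= 0 and u_j <= M_surge
    T.append(t_j)
    prev = t_j
print(T, feasible)
```

The surge in period 3 (3.0 vehicle-equivalents against an average of 2.0) is covered by the 1.5 units of slack banked in periods 1 and 2.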

ALTERNATIVE FORMULATIONS 

Alternative formulations and modifications were investigated during the formulation of the linear 
programming model described above. Two of these warrant discussion, since they highlight future 
areas of research or give a better understanding of why some models have not been developed in spite 
of their theoretical tractability. 

Route (Arc-Chain) Formulation 

The previous formulation may be described as a route (arc-chain) formulation. The variable $x_{i,j,n}$ corresponds to the flow over the nth route, and the constraints upon the vehicles employed on the route are generated by entering the variable $x_{i,j,n}$ into each of the appropriate vehicle constraint equations, using the implicit assumption that no storage is allowed at any of the transshipment nodes and that the next vehicle is available when required.

Node-Arc Formulation 

A node-arc formulation defines a new variable $x_{i,j,q}$ corresponding to a shipment on the qth vehicle oriented arc carrying requirement i in the jth time period. Using this formulation, it is possible to consider in-transit storage of requirements as long as enough of each requirement i is delivered at the initial node of arc q before it can be transshipped over the arc. This requires additional constraint equations at each transshipment node m of the form:

(10) $$\sum_{x \in B(m)} x_{i,j',q} \ge \sum_{x \in A(m)} x_{i,j',q}, \qquad i \in I,\ j \in J,$$

where

$B(m)$ = the set of shipments of requirement i into node m that are available for transshipment in or prior to time period j, and

$A(m)$ = the set of shipments of requirement i out of node m in or prior to time period j.
This is just an inequality form of the conservation of flow equations, with the excess of input over output 
taken as storage at the mth node. In addition, it is possible to place an upper bound upon the quantity 
of goods being stored. 
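Equation (10) is a cumulative-flow condition. A sketch with assumed per-period shipments, in which the running excess of inflow over outflow is exactly the in-transit storage at the node:

```python
import numpy as np

# Shipments of one requirement into / out of a transshipment node per period
# (assumed data).  Eq. (10) requires cumulative inflow >= cumulative outflow;
# the excess is the quantity held in storage at the node.
inflow  = np.array([5.0, 0.0, 3.0, 0.0])
outflow = np.array([2.0, 2.0, 2.0, 2.0])

storage = np.cumsum(inflow) - np.cumsum(outflow)
feasible = bool((storage >= 0).all())
print(storage, feasible)
```

An upper bound on storage would simply cap each entry of `storage`.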

The node-arc formulation has two attractive features: (1) inventories at nodes are allowed;* and 
(2) the number of variables only increases linearly with the number of route segments. The distressing 



*A limited capability to incorporate in-transit storage within the arc-chain formulation may be accomplished by defining additional variables corresponding to the storage of a requirement at a transshipment node for a fixed number of time periods.




feature of this formulation is the increase in the number of constraint equations required. An ad- 
ditional constraint equation is required for each transshipment node, for each requirement, in each 
time period. The linear program must manipulate a larger basis, and may require additional auxiliary 
storage, which sharply increases the computation times. 

For the above reasons the route formulation was chosen for implementation. The additional 
capability of a node-arc formulation appears to be outweighed by the increased computational burden. 
Although the number of possible routes in a graph can literally explode as the complexity of the net- 
work increases, it does not seem to be a practical limitation, since many sets of routes can be excluded 
a priori, and the computational effort resulting from the additional variables does not seem to be 
prohibitive. 

REPOSITIONING OF VEHICLES 

The previous formulations have explicitly made assumptions about the distribution of vehicles 
to shipping nodes in the generation of the vehicle constraints. A common assumption is that a vehicle 
"cycles" back to the point from which it left. Under this assumption a vehicle used for a shipment 
is constrained from further use until it returns to the shipping node. Upon its return, it is made available 
for immediate reuse at any other shipping node in the network. For some problems this may be a serious 
error. If a ship returning from London to New York is immediately reused at San Francisco without 
a transit time delay to reposition to the West Coast, an unrealistic situation exists. 

Vehicle repositioning may be taken into consideration by defining repositioning nodes m for each 
vehicle and defining a new decision variable yj,k,m,m> corresponding to the number of empty vehicles 
of the Arth type repositioned from node m to node m' in time period j. It is necessary to add additional 
vehicle repositioning constraint equations to maintain a flow balance at each of the nodes m and m' . 

CASE 1: m is an unloading point for the kth vehicle type:

(11) $$\sum_{x \in B(m)} \frac{x_{i,j',n}}{V_{i,j',k,n}} \ge \sum_{y \in A(m)} y_{j',k,m,m'}, \qquad k \in K_1,\ j \in J,$$

where

$B(m)$ = the set of shipments using the kth vehicle type that are available for repositioning from node m in or prior to time period j, and

$A(m)$ = the set of vehicles of type k repositioned from node m in or prior to time period j.

CASE 2: $m'$ is a loading point for the kth vehicle type:

(12) $$\sum_{y \in A(m')} y_{j',k,m,m'} \ge \sum_{x \in B(m')} \frac{x_{i,j',n}}{V_{i,j',k,n}}, \qquad k \in K_1,\ j \in J,$$

where

$A(m')$ = the set of vehicles of type k repositioned to node $m'$ for reuse in or prior to time period j, and

$B(m')$ = the set of shipments out of node $m'$ using vehicle type k in or prior to the jth time period.

In either case, the slack variable in the constraint equation corresponds to the inventory of vehicles 
at node m or node m' . 

Although these additional constraints seem to enhance the model, it must be remembered that 
the model is a linear program with continuous variables. Hence, the number of vehicles repositioned 






is not necessarily of integral value and the added capability for repositioning may not lead to a more 
realistic solution. 

COMPUTER MODEL 

Computer programs have been written to generate the matrix elements corresponding to Eqs. (1) 
through (9) with the exception of the maximum flow option, and to generate reports resulting from the 
linear programming results. These programs are written in FORTRAN IV to be used in conjunction 
with the Control Data ALEGRO Linear Programming System on a 3600 computer or with the IBM mathematical programming system (MPS) on a 360/50 or 360/65 computer. The authors have
found that by inserting "artificial" variables with high objective function weights into the appropriate 
constraint equations, the linear program running time is significantly reduced from the running time 
resulting from the use of the two-phase method of the revised simplex algorithm. 
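The "artificial variables with high objective weights" device is the classical big-M construction. A minimal sketch (toy equality system, not the authors' matrices): an identity block of artificials with a large cost makes any nonnegative right-hand side immediately feasible, and the penalty drives the artificials out of the optimal solution:

```python
import numpy as np
from scipy.optimize import linprog

# Equality system A x = b, augmented with big-M artificial variables.
A = np.array([[1.0, 1.0], [2.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0])
BIG_M = 1e6                                 # "high objective weight"

A_aug = np.hstack([A, np.eye(2)])           # columns: x1, x2, a1, a2
c_aug = np.concatenate([c, [BIG_M, BIG_M]])

res = linprog(c=c_aug, A_eq=A_aug, b_eq=b, bounds=[(0, None)] * 4)
x, artificials = res.x[:2], res.x[2:]
print(x, artificials)   # artificials leave the basis at the optimum
```

Modern solvers handle infeasibility internally, so this is only an illustration of the device the authors describe, not a recommendation for current codes.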

Only limited computational results have been obtained to date; however, Table II summarizes the
computational experience obtained while testing the model formulation on a set of sample problems. 
All of the samples were run on an IBM 360/50 computer. The execution times do not include the time 
for matrix generation or the reformatting of the linear programming solution. 

Table II. Computational Results

                          LP Parameters                             Problem description
 Problem                 Density               Execution    Time     Require-           Vehicle
 number   Rows  Columns    (%)    Iterations  time (min)   periods    ments    Routes    types
    1       55     252     6.55       41         0.49         12         6        4         4
    2       55     252     6.55       17         0.79         12         6        4         4
    3       73     199     5.81       28         0.50         12        12        8         4
    4       85     244     5.61       23         0.45         12        12       28         4
    5      133     293     3.48       46         0.70         12        12       28         8
    6      263    2924     1.32      136         4.91         17        74      108        13
    7      390     862     1.54      107         2.67         25       110       35        12
    8      390    1422     2.34      112         4.29         25       110       77        12

Problems 1 and 2 consider a network of only two nodes and four parallel routes between these 
nodes. This network corresponds to a movement of requirements from a single POE to a single POD 
utilizing four distinct types of vehicles. The rows of the constraint matrix are composed of 48 vehicle 
constraint equations, one per vehicle type per time period, together with six requirement constraints. 
The remaining row is the objective function. 

Problem 1 minimizes the overall closure time of the movement, and Problem 2 minimizes 
the weighted deviations of the movement requirements about their preferred delivery dates. Although 
Problem 2 requires fewer iterations to achieve an optimal solution, it requires more computational time.
Behavior of this type is not atypical of the computational experience obtained with most commercial 
LP codes when applied to small problems and is most likely due to the heuristic multiple pricing rules 
or reinversion policy employed in the iterating agenda. 




Problem 3 is an extension of the first two problems and includes a second POE-POD pair, an en- 
route base for aircraft, and an increased number of movement requirements. In attaining the minimum 
deviation solution, the number of aircraft sorties through the enroute base was constrained; however, transshipment was not allowed.

Problems 4 and 5 allow transshipment and include two origins, two POE's, one enroute base, one 
POD, and one destination. The problems differ only in the number of constrained vehicle types. One 
of the POE's and the enroute base are constrained. Problem 4 did not consider the origin to POE or 
POD to destination vehicles to be limited in number. Problem 5 placed an upper bound upon the number 
of these short haul surface vehicles available. 

Problems 6 through 8 consider larger problems that are similar in structure to Problem 5. As 
before, the number of iterations required and the computational times do not appear to obey any fixed 
relationship with respect to the problem size in terms of the parameters of the linear programs. 

REFERENCES 

[1] Dantzig, G. B., W. O. Blattner, and M. R. Rao, "Finding a Cycle in a Graph with Minimum Cost 
to Time Ratio with Application to a Ship Routing Problem," Operations Research House, Stanford 
University, California, Report 66-1 (Nov. 1966). 

[2] Fitzpatrick, G. R., J. Bracken, M. J. O'Brien, L. G. Wentling, and J. C. Whiton, "Programming the 
Procurement of Airlift and Sealift Forces: A Linear Programming Model for Analysis of the Least- 
Cost Mix of Strategic Deployment Systems," Nav. Res. Log. Quart., 14, 241-255 (1967). 

[3] Laderman, J., L. Gleiberman, and J. F. Egan, "Vessel Allocation by Linear Programming," Nav. 
Res. Log. Quart., 13, 315-320 (1966). 

[4] Lewis, R. W., E. F. Rosholdt, and W. L. Wilkinson, "A Multi-Mode Transportation Network Model," 
Nav. Res. Log. Quart., 12, 261-274 (1965). 

[5] Rao, M. R. and S. Zionts, "Allocation of Transportation Units to Alternative Trips — A Column 
Generation Scheme with Out-of-Kilter Subproblems," Operations Research, 16, 52-63 (1968). 

[6] Schwartz, N. L., "Discrete Programs for Moving Known Cargos from Origins to Destinations on 
Time at Minimum Bargeline Fleet Cost," Transportation Science, 2, 134-345 (1968). 

[7] Whiton, J., "Some Constraints on Shipping in Linear Programming Models," Nav. Res. Log. Quart., 
14, 257-260 (1967). 



INTERVAL ESTIMATION OF THE NORMAL MEAN SUBJECT TO RESTRICTIONS, WHEN THE VARIANCE IS KNOWN*



Saul Blumenthal 

New York University 
Bronx, New York 



1. INTRODUCTION AND SUMMARY 

We consider here the problem of obtaining confidence interval estimates of the mean of a normal 
distribution when the mean is restricted to be nonnegative, and the variance is known. The problem 
of point estimation for truncated parameters has been considered by Katz [13] and Farrell [9]. For 
the interval estimation problem, we shall work with a loss function 

(1.1) $$L(\theta, I) = c\,(\delta_2(\cdot) - \delta_1(\cdot)) + \begin{cases} 0, & \theta \in I(\cdot) \\ 1, & \theta \notin I(\cdot), \end{cases}$$

where the interval estimate $I(\cdot) = [\delta_1(\cdot), \delta_2(\cdot)]$ is closed with upper and lower end points $\delta_2$ and $\delta_1$, respectively, and $\theta$ is the mean being estimated. The loss function (1.1) was used by Joshi [12] in
demonstrating admissibility of the standard interval estimate for the normal mean in the unrestricted 
case. Dudewicz [8] considers this loss function in constructing interval estimates for ranked means. 
For the unrestricted mean problem, Ferguson [11, p. 184] mentions this loss function, and variations 
on it have been used by Aitchison and Dunsmore [1] where the expected size of the error rather than 
the error probability appears in the risk function, and by Konijn [14 and 15] where the risk function 
used the noncoverage probability and the probability of covering incorrect values. 

The problem of constructing confidence intervals for restricted parameters has been considered
in some detail for the logistic distribution by Bartholomew [3], and has been discussed in general by
Stein [18] and Bartholomew [2].

In Section 2, we obtain Bayes estimates with respect to folded normal prior distributions, and
demonstrate the admissibility of the generalized Bayes interval obtained by taking a uniform prior
distribution on (0, ∞). The admissibility is subject to the same equivalence class restrictions as in
Joshi [12], and the reader is referred to that paper for a full discussion of admissibility for interval
estimators. The risk functions of the generalized Bayes estimator and a natural truncated estimator
are discussed in Section 3, along with the question of minimaxity. The truncated estimator is shown
to be minimax. In Section 4, we consider sequential stopping rules where sampling cost is linear, and
we find rules which are asymptotically pointwise optimal, and asymptotically optimal Bayes rules, in
the sense discussed by Bickel and Yahav [6]. In Section 5, we replace c by h(θ) in (1.1) and examine
how the length of the resulting Bayes intervals is related to h(θ).



*Research supported by National Science Foundation Grants GP-7024 and GP-7399, School of Engineering and Science, 
New York University. 

485 




2. BAYES INTERVALS AND ADMISSIBILITY 

The observations X₁, . . ., Xₙ are independent with common normal distribution N(θ, v²), with
known variance v². The average X̄ is then N(θ, v²/n). To avoid continually writing v²/n, we define

(2.1) σ² = v²/n, Y = X̄.

Corresponding to the loss function (1.1), we have the risk function

(2.2) R(θ, I) = cE_θ(δ₂(Y) − δ₁(Y)) + 1 − P_θ(δ₁(Y) ≤ θ ≤ δ₂(Y)).

To generate Bayes estimates, we let θ have a normal distribution with mean 0 and variance τ², confined
to θ ≥ 0, i.e.,

(2.3) g_τ(θ) = (2/πτ²)^{1/2} e^{−θ²/2τ²}, θ ≥ 0,

      = 0, θ < 0.

The posterior density of θ given Y is

(2.4) g_τ(θ|y) = {Φ(y/σγ)}^{−1} {γ²/2πσ²}^{1/2} e^{−(γ²/2σ²)(θ − y/γ²)²}, θ ≥ 0,

      = 0, θ < 0,

where

(2.5) γ² = 1 + (σ²/τ²).

For the purpose of finding the Bayes estimation interval, we characterize interval estimators by
a function ξ(θ, y), where

(2.6) ξ(θ, y) = 1 if δ₁(y) ≤ θ ≤ δ₂(y),

      = 0 otherwise.

LEMMA 2.1: The Bayes estimation interval for prior densities (2.3) and loss function (1.1) is given by

(2.7) δ₁(Y) = max [0, M(γ, Y) − D(γ, Y)]

      δ₂(Y) = max [0, M(γ, Y) + D(γ, Y)],

where

(2.8) M(γ, Y) = Y/γ²

and

(2.9) D(γ, Y) = {(2σ²/γ²) log [(γ²/2πσ²)^{1/2}/(cΦ(Y/σγ))]}^{1/2},

provided that




(2.10) c < (γ²/2πσ²)^{1/2} [Φ(Y/σγ)]^{−1}.

If (2.10) is violated, the interval shrinks to a point, which can be chosen arbitrarily as any nonnegative
number.

PROOF: Using (2.6), we can write

(2.11) E L(θ, I) = 1 + ∫₀^∞ g_τ(θ) dθ ∫_{−∞}^{∞} [c(δ₂(y) − δ₁(y)) − ξ(θ, y)] f(y|θ) dy

              = 1 + ∫_{y=−∞}^{∞} f_τ(y) dy ∫_{θ=0}^{∞} [c − g_τ(θ|y)] ξ(θ, y) dθ,

where Fubini's Theorem justifies changing the order of integration and the conditions for using the
theorem are verified as in Joshi [12, Lemma 5.1], and where

(2.12) f_τ(y) = (2/πγ²τ²)^{1/2} Φ(y/σγ) e^{−y²/(2γ²τ²)}.

The expression in (2.11) will be minimized by minimizing the inner integral on the second line for
each y, namely by taking

(2.13) ξ_τ(θ, y) = 1 if g_τ(θ|y) ≥ c,

       = 0 otherwise.

Since g_τ(θ|y) is unimodal and monotone on either side of the mode, it is seen that (2.13) and (2.7) are
equivalent. Condition (2.10) is needed to guarantee that g_τ(θ|y) ≥ c for some θ. This completes the
proof.

NOTE: Condition (2.10) assures that the cost due to length does not dominate the cost due to
noncoverage. Since any point estimate has zero length and zero coverage probability, the degeneracy
in the location arises. Because condition (2.10) depends on Y, it is possible for small Y values to lead to
nondegenerate intervals, while larger Y values lead to the degenerate situation. To avoid this, it seems
desirable to add a stronger condition assuring nondegenerate intervals for all observations. Noting that

(1/2πσ²)^{1/2} ≤ (γ²/2πσ²)^{1/2} ≤ (γ²/2πσ²)^{1/2} [Φ(y/σγ)]^{−1},

we can eliminate the dependence on Y by using (γ²/2πσ²)^{1/2} as a bound for c. This still leaves the
uniqueness dependent on τ (through γ), and when we take limits as τ → ∞, we will want c such that a
nondegenerate solution holds for all τ. This suggests imposing the condition

(2.14) c < (1/2πσ²)^{1/2},

which we shall do henceforth.

Even under (2.10), it is possible to have a degenerate interval at zero because M + D may be
negative. The requirement that M + D > 0 for all Y reduces (for Y < 0) to

(2.15) (cσ√(2π)/γ) Φ(Y/σγ) < e^{−Y²/(2σ²γ²)}.




Since it is always true that Φ(x) < e^{−x²/2} (x ≤ 0), it is seen that the stronger condition (2.14) assures
M + D > 0, so that no point estimates will be incurred. Condition (2.10) alone does not insure (2.15).

We summarize the above remarks in Corollary 2.1.

COROLLARY 2.1: The Bayes estimation intervals I^τ for prior densities (2.3), loss (1.1), and (2.14)
are nondegenerate for all Y and τ, with δ₁(Y) given by (2.7), and

(2.16) δ₂(Y) = M(γ, Y) + D(γ, Y) > 0.

We shall now study the generalized Bayes rule obtained by taking a uniform distribution on
(0, ∞). It is obtainable from the previous results as indicated below. We denote it by I^p.

COROLLARY 2.2: Under the conditions of Corollary 2.1, the uniform generalized Bayes interval
estimate is obtained by letting τ → ∞, and is given by

(2.17) δ₁(Y) = max [0, Y − D(Y)]

       δ₂(Y) = Y + D(Y),

where

(2.18) D(Y) = {−2σ² log [cσ√(2π) Φ(Y/σ)]}^{1/2}.
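As a numerical illustration (not part of the original paper), the interval (2.17)–(2.18) is immediate to compute. The sketch below uses assumed values c = 0.1 and σ = 1, which satisfy condition (2.14):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def generalized_bayes_interval(Y, c, sigma):
    """Uniform-prior generalized Bayes interval I^p of (2.17)-(2.18).

    Requires c < (1/(2*pi*sigma^2))**0.5, condition (2.14), which keeps the
    argument of the logarithm below 1 for every Y, so D(Y) is always real.
    """
    assert c < (1.0 / (2.0 * math.pi * sigma**2)) ** 0.5, "condition (2.14)"
    # D(Y) = {-2 sigma^2 log[c sigma sqrt(2 pi) Phi(Y/sigma)]}^{1/2}
    D = math.sqrt(-2.0 * sigma**2
                  * math.log(c * sigma * math.sqrt(2.0 * math.pi) * Phi(Y / sigma)))
    return max(0.0, Y - D), Y + D   # (delta_1, delta_2)

# Illustrative values (c and sigma are assumptions, not from the paper):
c, sigma = 0.1, 1.0
for Y in (-3.0, -1.0, 0.0, 1.0, 3.0):
    d1, d2 = generalized_bayes_interval(Y, c, sigma)
    print(f"Y = {Y:5.1f}: I^p = [{d1:.4f}, {d2:.4f}]")
```

Note that δ₂ stays positive for every Y, as Corollary 2.2 asserts, and that for large Y the half-length approaches the constant {−2σ² log cσ√(2π)}^{1/2} of (3.2) below.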

Next, we study the risk function of the Bayes estimator.

LEMMA 2.2: The risk function of I^τ is given for θ ≥ 0 by

(2.19) R(θ, I^τ) = c{(1/γ²)[θΦ(γu(γ) − (θ/σ)) − (σ/√(2π)) e^{−(1/2)(γu(γ)−(θ/σ))²}]

+ 2√2(σ/γ) ∫_{−∞}^{∞} [−log c√(2π)(σ/γ)Φ(γ^{−1}(y + (θ/σ)))]^{1/2} φ(y) dy

− √2(σ/γ) ∫_{−∞}^{γu(γ)−(θ/σ)} [−log c√(2π)(σ/γ)Φ(γ^{−1}(y + (θ/σ)))]^{1/2} φ(y) dy}

+ 1 − {Φ(γu₂(γ, θ) − (θ/σ)) − Φ(γu₁(γ, θ) − (θ/σ))},

where u₁(γ, θ) and u₂(γ, θ) are the two solutions of

(2.20) Φ(x) = (γ/cσ) φ(x − (γθ/σ)),

and u(γ) = u₂(γ, 0) is the unique positive solution of (2.20) for θ = 0.

PROOF: Since D(γ, Y) > 0 a.s., it is immediately obtained that

(Y/γ²) ≥ D(γ, Y) (i.e., δ₁(Y) > 0 in (2.7)) if and only if Φ(Y/γσ) ≥ (γ/cσ)φ(Y/γσ), Y ≥ 0.

Furthermore, Φ(u) is a strictly increasing function of u while φ(u) is a strictly decreasing function
on [0, ∞]. Hence,

(Y/γ²) ≥ D(γ, Y) ⟺ (Y/γσ) ≥ u(γ).

Therefore,




E_θ{δ₂(Y) − δ₁(Y)} = 2E_θ{D(γ, Y) I_{[σγu(γ), ∞)}(Y)} + E_θ{(M(γ, Y) + D(γ, Y)) I_{(−∞, σγu(γ))}(Y)}.

After substituting y = (Y − θ)/σ and using (2.8) and (2.9), it may be immediately verified that this
yields the terms in (2.19) which are multiplied by c.

Write the noncoverage probability as P_θ[δ₂(Y) < θ] + P_θ[δ₁(Y) > θ ≥ 0]. The first of these can be
written as

P{M + D < θ} = P{D² < (θ − M)², θ > M} = P{Φ(Y/σγ) > (γ/cσ)φ((Y/σγ) − (γθ/σ)), Y < γ²θ},

and similarly for the other probability. Combining these gives as the coverage probability

(2.21) P_θ{Φ(Y/σγ) ≤ (γ/cσ)φ((Y/σγ) − (γθ/σ))}.

Note that Eq. (2.20) does have two solutions when θ > 0, since (2.14) assures [φ(x − (γθ/σ))/Φ(x)]
> (cσ/γ) at x = (γθ/σ), and lim_{x→±∞} [φ(x − (γθ/σ))/Φ(x)] = 0. The probability statement in
(2.19) follows from (2.21) and straightforward manipulation.

NOTE: The risk function of I^p is obtained by setting γ = 1 in (2.19).

We shall now examine the difference between the Bayes risks of I^τ and I^p. Let

(2.22) r(τ, I) = ∫₀^∞ R(θ, I) g_τ(θ) dθ

denote the Bayes risk of an estimation interval I. In order to bound this difference, we must first look
at the functions u_i(γ, θ), i = 1, 2, defined by (2.20).

LEMMA 2.3: The solutions u_i(γ, θ) of (2.20) satisfy, for all θ,

(2.23) |u_i(γ, θ) − (γθ/σ)| ≥ [2 log (γ/cσ√(2π))]^{1/2}, i = 1, 2.

Also, |u_i(γ, θ) − (γθ/σ)| is monotone decreasing as θ increases, with a limit of [2 log (γ/cσ√(2π))]^{1/2}.
The functions u_i(γ, θ) are monotone increasing in θ (i = 1, 2). For all θ sufficiently large (θ > θ*, say),

(2.24) |u_i(γ, θ) − (γθ/σ)| < [2 log (γ/cσ√(2π))]^{1/2} + [(2cσ/γ) e^{−(1/2)(γθ/σ)}]^{1/2}.

Finally, the functions u(γ) = u₂(γ, 0) satisfy, for all τ ≥ 1,

(2.25) 0 < u(γ) − u(1) < (σ/2c)(1 + (σ²/2))(1/τ²).

The proof of this and the following lemma will be given in the Appendix. 

LEMMA 2.4: The difference in Bayes risks between I^p and I^τ is O(τ^{−2}), i.e., for some B,

(2.26) τ²[r(τ, I^p) − r(τ, I^τ)] < B, for all τ.

Not only is the left side of (2.26) bounded, but it also has a limit as τ → ∞, which we state below.

COROLLARY 2.3: The limit as τ → ∞ of the left side of (2.26) exists and is given by




(2.27) lim_{τ→∞} τ²[r(τ, I^p) − r(τ, I^τ)] = (cσ/4)(1 + √2)[−log cσ√(2π)]^{1/2} + (σ/4)[−log cσ√(2π)]^{−1/2}.

The proof of this Corollary is given in the Appendix.

Next, we establish our main result.

THEOREM 2.1: Let I^p be the interval estimator (2.17), and I′ any other estimator such that

(2.28) R(θ, I′) ≤ R(θ, I^p) for all θ ≥ 0.

Then I′ is equivalent to I^p, i.e.,

ξ′(x, θ) = ξ^p(x, θ) for almost all (x, θ) ∈ R × (0, ∞),

where ξ(x, θ) is defined by (2.6).

NOTE: With respect to the loss function (1.1), Theorem 2.1 establishes the admissibility of I^p
among all confidence set estimators up to an equivalence class of procedures. Without the qualification
of "up to an equivalence class," or some restriction on the class of confidence set procedures, no
interval estimator can be admissible. This is due to the discontinuous nature of the loss function (1.1):
any estimator I can be modified on a set of two-dimensional Lebesgue measure zero to im-
prove its coverage probability on a set of θ values having measure zero. (Joshi [12] discusses this in
some detail.) Theorem 2.1 shows that if (2.28) holds, strict inequality can hold only on a set of θ values
having Lebesgue measure zero, i.e., "almost" (or "weak") admissibility, but the conclusion of equiva-
lence of the estimators is a stronger one than the conclusion only of "almost" admissibility.

If the class of procedures is restricted to those which yield closed intervals for every observation,
then Theorem 2.1 establishes admissibility without any qualifications regarding equivalence classes.
Further, Theorem 2.1 establishes the "strong admissibility" of I^p according to the definition used
by Joshi [12, Section 7].

PROOF OF THEOREM 2.1: We follow the line of Joshi's [12] argument, making appropriate
modifications.

Let

(2.29) U^{(i)}(y) = 1 + ∫₀^∞ [c − g_∞(θ|y)] ξ^{(i)}(y, θ) dθ = E{L(θ, I^{(i)}) | Y = y} (i = 1, p),

where

(2.30) g_∞(θ|y) = {Φ(y/σ)}^{−1} {2πσ²}^{−1/2} e^{−(θ−y)²/(2σ²)}, θ ≥ 0.

Clearly U^{(i)}(y) is minimized when

(2.31) ξ^{(i)}(y, θ) = 1 for g_∞(θ|y) ≥ c

       = 0 for g_∞(θ|y) < c,

i.e., for I^{(p)}. Thus, either

(I) U^{(1)}(y) = U^{(p)}(y) for almost all y in R,

or

(II) U^{(1)}(y) > U^{(p)}(y) for a subset S of R with positive measure.




If (I) holds, then the desired equivalence can be established as on page 1065 of Joshi [12]. We show
that (II) leads to a contradiction.

Note that (II) implies that there exists a positive number k such that for some a

(2.32) ∫_{−a}^{a} Φ(y/σ)[U^{(1)}(y) − U^{(p)}(y)] dy = k.

We now show, as in Joshi [12], that k = 0, achieving the desired contradiction. Using (2.11) we see that

(2.33) E{L(θ, I^{(1)}) − L(θ, I^{(p)})} = (2/πγ²τ²)^{1/2} ∫_{−∞}^{∞} G_τ(y) dy,

where

(2.34) G_τ(y) = e^{−y²/(2γ²τ²)} Φ(y/σγ) {[c(δ₂^{(1)}(y) − δ₁^{(1)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(1)}(y, θ) dθ]

− [c(δ₂^{(p)}(y) − δ₁^{(p)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(p)}(y, θ) dθ]},

and g_τ(θ|y) is given by (2.4). The limit as τ → ∞ of g_τ(θ|y) is g_∞(θ|y) of (2.30), and the Helly–Bray
lemma justifies taking the limit inside the integrals of (2.34), so that by (2.29) we find

(2.35) lim_{τ→∞} G_τ(y) = Φ(y/σ)[U^{(1)}(y) − U^{(p)}(y)].

Next we justify the interchange of limit and integration,

(2.36) lim_{τ→∞} ∫_{−a}^{a} G_τ(y) dy = ∫_{−a}^{a} lim_{τ→∞} G_τ(y) dy = k,

by showing that |G_τ(y)| < G(y), in particular

(2.37) |G_τ(y)| < G(y) = c{(δ₂^{(1)}(y) − δ₁^{(1)}(y)) + (δ₂^{(p)}(y) − δ₁^{(p)}(y))} + 2.

By definition (2.17), [δ₂^{(p)}(y) − δ₁^{(p)}(y)] is bounded uniformly. Now consider

(2.38) c ∫_{−a}^{a} [δ₂^{(1)}(y) − δ₁^{(1)}(y)] dy = c ∫_{−a}^{a} [δ₂^{(1)}(y) − δ₁^{(1)}(y)] (1/√(2π)σ) e^{−y²/(2σ²)} (√(2π)σ) e^{y²/(2σ²)} dy

≤ σ√(2π) e^{(a²/2σ²)} E_{θ=0}{c(δ₂^{(1)}(Y) − δ₁^{(1)}(Y))}

≤ σ√(2π) e^{(a²/2σ²)} R(0, I^{(1)}) ≤ σ√(2π) e^{(a²/2σ²)} R(0, I^{(p)}),

the last inequality from (2.28). Thus G(y) is integrable over (−a, a), and from (2.36) we conclude
that, given ε > 0, there is a τ₀ such that for all τ > τ₀

(2.39) ∫_{−a}^{a} G_τ(y) dy > k − ε.

Now we observe that, from the construction of a Bayes procedure,

(2.40) G_τ(y) ≥ e^{−y²/(2γ²τ²)} Φ(y/σγ) {[c(δ₂^{(τ)}(y) − δ₁^{(τ)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(τ)}(y, θ) dθ]

− [c(δ₂^{(p)}(y) − δ₁^{(p)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(p)}(y, θ) dθ]} = D_τ(y) ≤ 0.

Thus,

(2.41) (2/πγ²τ²)^{1/2} ∫_{−a}^{a} G_τ(y) dy ≥ (2/πγ²τ²)^{1/2} ∫_{−a}^{a} D_τ(y) dy

≥ (2/πγ²τ²)^{1/2} ∫_{−∞}^{∞} D_τ(y) dy = r(τ, I^τ) − r(τ, I^p).

By using (2.33), (2.39), and (2.41), and assuming τ₀ > 1,

(2.42) r(τ, I^{(1)}) − r(τ, I^p) ≥ (2/πγ²τ²)^{1/2}(k − ε) − [r(τ, I^p) − r(τ, I^τ)].

By (2.28), r(τ, I^{(1)}) − r(τ, I^p) ≤ 0, so that

(2.43) k ≤ (πγ²/2)^{1/2} τ[r(τ, I^p) − r(τ, I^τ)] + ε.

From Lemma 2.4 and the arbitrary nature of ε, it follows that k = 0 and the proof of the Theorem is
complete.

3. RISK OF I^p AND MINIMAX ESTIMATION

We shall now investigate the behavior of the uniform generalized Bayes interval as a function of the
observed mean Y, and the risk function of I^p. We shall introduce a "truncated" estimate and show that
it is minimax. It has not been possible to demonstrate the (conjectured) minimaxity of I^p.

LEMMA 3.1: The length of the interval I^p is a unimodal function of Y with maximum of

(3.1) 2{−2σ² log [cσ√(2π) Φ(u(1))]}^{1/2}

at Y = σu(1), where u(1) (> 0) is given by (2.20) with γ = 1, θ = 0.

The length has a limit of

(3.2) 2{−2σ² log [cσ√(2π)]}^{1/2}

as Y increases, and behaves like

(3.3) (σ²/|Y|) log |Y|

as Y decreases to −∞. Both end points δ₁(Y), δ₂(Y) are monotone increasing functions of Y, and
δ₂(Y) > 0, all Y.

PROOF: From the characterization (2.30) and (2.31), we see that the end points δ_i(y) (i = 1, 2)
of I^p satisfy the equation

(3.4) {2πσ²}^{−1/2} e^{−(δ−y)²/(2σ²)} = cΦ(y/σ),

so that (δ − y) decreases monotonely as y increases. The length is 2(δ − y) if y ≥ σu(1), where u(1)
is defined by (2.20), and the length is just δ₂ (the larger root) if y < σu(1). The assertion that δ₂ is
monotone increasing with y is equivalent to the assertion that the equation




(3.5) Φ(x) − (1/cσ) φ(a − x) = 0, a > 0,

has only one solution x with x < a. The latter statement follows by noting that (3.5) is positive and
increasing for x < a − y₂ (say), is decreasing for a − y₂ < x < a − y₁ (say), then is increasing for
a − y₁ < x, and is negative at x = a.

The monotonicity of δ₁ is easily verified by implicit differentiation of (3.4). Putting Φ(y/σ) = 1 in (3.4)
gives (3.2), and using the tail approximation for (1 − Φ(x)) when x is large (Feller [10, p. 166]) gives
(3.3). This completes the proof.

Next we look at some aspects of the behavior of the risk function R(θ, I^p). Let I⁰ denote the in-
terval with end points Y ± {−2σ² log [cσ√(2π)]}^{1/2}, which is the Pitman-type estimation interval for
unrestricted θ and was shown by Joshi [12] to be admissible and minimax with constant risk

(3.6) R̄ = R(θ, I⁰) = 2[c{−2σ² log [cσ√(2π)]}^{1/2} + Φ(−{−2 log [cσ√(2π)]}^{1/2})].

Comparing the risks of the two estimators, we obtain:

LEMMA 3.2: The risk function R(θ, I^p) is discontinuous at θ = 0, with

(3.7) R(0−, I^p) = 1 + c{2√2 σ ∫_{−∞}^{∞} [−log cσ√(2π) Φ(y)]^{1/2} φ(y) dy

− √2 σ ∫_{−∞}^{u(1)} [−log cσ√(2π) Φ(y)]^{1/2} φ(y) dy − σφ(u(1))}

= R̄ + Φ(u(1))[1 − (cσ)(cσ + u(1))] > 1 > R̄,

(3.8) R(0, I^p) = R(0+, I^p) = R(0−, I^p) − Φ(u(1)) = R̄ − (cσ)Φ(u(1))[cσ + u(1)] < R̄,

and

(3.9) lim_{θ→∞} R(θ, I^p) = R̄.

PROOF: The expected length of I^p is a continuous function of θ, but the coverage probability
drops to 0 for θ < 0. This accounts for the first part of (3.8) and for (3.7), which represents R(0−, I^p) as
unity plus c times the expected length at zero. The other equality of (3.7) comes from using the change of
variables v = {−2 log cσ√(2π)Φ(y)}^{1/2} to evaluate the integrals. The first inequality comes from the
positive character of the expected length. The second comes from the fact that any point estimator has
constant risk of unity, and I⁰, being admissible, must have smaller risk. By taking limits inside the
integrals in (2.19) and using Lemma 2.3, (3.9) is verified, completing the proof.

NOTE: The argument used to establish the first inequality in (3.7) can be extended easily to show

(3.10) R(θ, I^p) > 1 > R̄, all θ < 0.

The preceding lemma suggests very strongly that the risk function of I^p is ≤ R̄ for all θ ≥ 0.
However, we have been unsuccessful in attempting to prove this boundedness. The motivation for
desiring it is the fact that any estimator whose risk is so bounded is minimax. This minimax property
can be established by essentially the same proof as is used in Blumenthal and Cohen [7, Theorem 3.1].




One estimator which is minimax is the obvious "truncated" estimator I^T based on I⁰, namely

(3.11) δ₁^T(Y) = max [0, Y − D]; δ₂^T(Y) = max [0, Y + D],

where

(3.12) D = {−2σ² log [cσ√(2π)]}^{1/2},

with the interval becoming a point for Y < −D. Clearly, for θ > 0, θ ∈ I⁰ ⟺ θ ∈ I^T, so that both intervals
have the same coverage probability, and I^T is never longer than I⁰. Thus R(θ, I^T) ≤ R̄, all θ ≥ 0, and I^T
is minimax. The risk function of I^T is given by

(3.13) R(θ, I^T) = c{D[1 − Φ((D − θ)/σ) + Φ((D + θ)/σ)] + σ[φ((D + θ)/σ) − φ((D − θ)/σ)]

+ θ[Φ((D − θ)/σ) + Φ((D + θ)/σ) − 1]} + 1 − [Φ(D/σ) − (1 − h(θ, 0))Φ(−D/σ)]

= R̄ − c{D[Φ((D − θ)/σ) + Φ(−(D + θ)/σ)] − θ[Φ((D + θ)/σ) − Φ((θ − D)/σ)]

+ σ[φ((D − θ)/σ) − φ((D + θ)/σ)]} − h(θ, 0)Φ(−D/σ),

where h(θ, 0) = 1 if θ = 0, and h(θ, 0) = 0 otherwise.

From (3.13), we see that lim_{θ→∞} R(θ, I^T) = R̄.
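The minimax bound R(θ, I^T) ≤ R̄ is easy to check by simulation. Under the definitions above one can verify that at θ = 0 the risk of I^T works out to cD + Φ(−D/σ) = R̄/2. The sketch below (with assumed values c = 0.1, σ = 1, not from the paper) estimates R(0, I^T) by Monte Carlo and compares it with R̄:

```python
import math
import random

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

c, sigma = 0.1, 1.0                      # assumed values satisfying (2.14)
D = math.sqrt(-2.0 * sigma**2 * math.log(c * sigma * math.sqrt(2.0 * math.pi)))
R_bar = 2.0 * (c * D + Phi(-D / sigma))  # constant risk of I^0, Eq. (3.6)

def loss_truncated(Y, theta):
    """Loss (1.1) of the truncated interval I^T of (3.11)-(3.12)."""
    lo, hi = max(0.0, Y - D), max(0.0, Y + D)
    return c * (hi - lo) + (0.0 if lo <= theta <= hi else 1.0)

random.seed(1)
theta = 0.0
n = 200_000
risk = sum(loss_truncated(random.gauss(theta, sigma), theta) for _ in range(n)) / n
print(f"R_bar = {R_bar:.4f}, simulated R(0, I^T) = {risk:.4f}")
```

The simulated risk sits well below R̄, consistent with the discontinuous drop of the risk at θ = 0 noted for I^p in Lemma 3.2.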

Although I^T is minimax, it is not likely that it is admissible, since its length is not a continuous function
of Y. Clearly, generalized Bayes intervals will have continuous lengths, so that I^T would not be in the
generalized Bayes class. If this class is complete, as is the case for point estimation (see Sacks [17]),
then I^T could not be admissible.

We close this section by taking a brief look at the coverage probability of I^p as a function of θ.

LEMMA 3.3: The coverage probability of I^p, given by

(3.14) P(θ) = Φ(u₂(1, θ) − (θ/σ)) − Φ(u₁(1, θ) − (θ/σ)),

is monotone decreasing in θ, with maximum (for θ ≥ 0) of Φ(u(1)) and minimum of (Φ(L) − Φ(−L)),

where the u's are given by (2.20) and

(3.15) L = {−2 log [cσ√(2π)]}^{1/2}.

PROOF: The proof follows directly from using Lemma 2.3 in (3.14), and (3.14) comes directly
from Lemma 2.2.

To compare with the case of unrestricted θ, let the coverage probability in that case be (1 − α).
Then L satisfies Φ(L) − Φ(−L) = (1 − α), and from (3.15), cσ = φ(L). Thus, from (2.20), u(1) satisfies

(3.16) φ(u(1))/Φ(u(1)) = φ(L),

and P(θ) decreases from Φ(u(1)) to (1 − α) as θ increases. From (3.15) we see that u(1) > L, and if
α is small so that L is large, Φ(u) ≈ 1 and u(1) ≈ L, giving max P(θ) = P(0) ≈ Φ(L) = 1 − (α/2).
(In fact, P(0) > 1 − (α/2).) Also, from above we see that φ(u) = Φ(u)φ(L) > Φ(L)φ(L), so that



u(1) < u*, where φ(u*) = Φ(L)φ(L) = (1 − (α/2))φ(L), and P(0) < Φ(u*). Numerically, we consider
two examples, α = 0.05 and 0.10.

α      L      1 − (α/2)   P(0) = Φ(u(1))   Φ(u*)

0.05   1.96   0.975       0.9755           0.976

0.10   1.64   0.95        0.952            0.953



It seems evident that for the usual ranges of α values, 1 − (α/2) is quite sufficient as an approximation
to P(0). We might also note that the above method of bounding P(0) suggests an iterative procedure
for solving (3.16), namely, define u_{i+1} by

φ(u_{i+1}) = Φ(u_i)φ(L), i = 0, 1, . . . (u₀ = L).

This gives a sequence whose even (odd) numbered terms are less (greater) than the desired solution,
and whose even (odd) terms are monotone increasing (decreasing), so that the sequence converges
to u(1).
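The iteration just described is immediate to implement. A sketch (using α = 0.05, so L ≈ 1.96) that approximately reproduces the first row of the table above:

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def solve_u1(L, iterations=50):
    """Iterate phi(u_{i+1}) = Phi(u_i) phi(L), u_0 = L, to solve (3.16).

    Inverting phi on the positive axis: phi(u) = t gives
    u = sqrt(-2 log(sqrt(2 pi) t)), valid since t < phi(0).
    """
    u = L
    for _ in range(iterations):
        t = Phi(u) * phi(L)
        u = math.sqrt(-2.0 * math.log(math.sqrt(2.0 * math.pi) * t))
    return u

L = 1.959964               # alpha = 0.05
u1 = solve_u1(L)
print(f"u(1) = {u1:.4f}, P(0) = Phi(u(1)) = {Phi(u1):.4f}, 1 - alpha/2 = 0.975")
```

The map is strongly contractive here (its derivative is roughly φ(u)/(Φ(u)u) ≈ 0.03 near the solution), so a few dozen iterations reach machine precision.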

4. THE POSTERIOR BAYES RISK AND SEQUENTIAL PROCEDURES

In this section, we shall consider the posterior expected loss given the observations for the Bayes
procedures of Section 2. We shall study a sequential rule which is in effect a curtailment of the fixed
sample rule and which guarantees a uniform bound on R(θ, I). Also, for linear sampling cost, a sequen-
tial Bayes rule which is Asymptotically Pointwise Optimal (A.P.O.) in the sense of Bickel and Yahav
[4-6], and asymptotically optimal, will be given.

The posterior expected loss for the Bayes procedure is given by the minimum value of the inner
integral of (2.11), namely

(4.1) r(Y) = r⁺(Y) = 1 + 2cD(γ, Y) − [Φ(M(γ, Y)/ζ)]^{−1}[2Φ(D(γ, Y)/ζ) − 1] if Y ≥ σγu(γ),

      r(Y) = r⁻(Y) = 1 + c[M(γ, Y) + D(γ, Y)] − [Φ(M(γ, Y)/ζ)]^{−1}[Φ(D(γ, Y)/ζ)

      − Φ(−M(γ, Y)/ζ)] if Y ≤ σγu(γ),

where

(4.2) ζ² = σ²/γ²,



and u(γ) is defined by (2.20), γ by (2.5), M(γ, Y) by (2.8), and D(γ, Y) by (2.9). We shall write simply
M and D where this causes no confusion.

By straightforward differentiation, we find that r⁺(Y) is a monotone increasing function of Y with
a maximum of

(4.3) 2[cζ{−2 log cζ√(2π)}^{1/2} + Φ(−{−2 log cζ√(2π)}^{1/2})].

The square-bracketed term in (4.3) goes monotonely to zero as ζ → 0 (i.e., as the number of observations
n increases). The function r⁻(Y) is also monotone increasing in Y with minimum of 0 at Y = −∞. Thus
the posterior risk r(Y) is monotone in Y with maximum (4.3).




If a Bayes confidence interval is sought in the case of unrestricted θ (−∞ < θ < ∞) relative to the
prior density (2.3), also on the whole line, with loss (1.1), the resulting estimate M(γ, Y) ± D(γ, ∞) has
constant posterior expected loss (4.3). Thus, if a sample size n₀ is chosen so that this interval estimate
will have Bayes risk r₀, the same risk can be achieved in the case of restricted θ by a sequential rule
which stops as soon as r(Y) ≤ r₀, and the sequential rule will never take more than n₀ observations.

We shall now consider the stochastic limit as n increases of the random variable r(Y). From this,
it will be possible to construct A.P.O. sequential stopping rules.
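For the uniform prior (γ = 1, so M = Y and ζ = σ), the posterior risk (4.1) is simple to evaluate, and one can check numerically that the two branches agree at the crossover point Y = σu(1). A sketch with assumed values c = 0.1, σ = 1 (u(1) is found by bisection on (2.20)):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

c, sigma = 0.1, 1.0   # illustrative values only; must satisfy (2.14)

def D_of(Y):
    # (2.9) with gamma = 1
    return math.sqrt(-2.0 * sigma**2
                     * math.log(c * sigma * math.sqrt(2.0 * math.pi) * Phi(Y / sigma)))

def posterior_risk(Y):
    """Posterior expected loss (4.1) with gamma = 1, so M = Y and zeta = sigma."""
    M, D = Y, D_of(Y)
    if M >= D:   # equivalent to Y >= sigma * u(1)
        return 1.0 + 2.0 * c * D - (2.0 * Phi(D / sigma) - 1.0) / Phi(M / sigma)
    return 1.0 + c * (M + D) - (Phi(D / sigma) - Phi(-M / sigma)) / Phi(M / sigma)

# u(1) solves Phi(u) = (1/(c*sigma)) * phi(u); bisection:
lo, hi = 0.0, 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if Phi(mid) * c * sigma < phi(mid):
        lo = mid
    else:
        hi = mid
u1 = 0.5 * (lo + hi)
Ystar = sigma * u1
print(f"u(1) = {u1:.6f}, r(Y*-eps) = {posterior_risk(Ystar - 1e-8):.8f}, "
      f"r(Y*+eps) = {posterior_risk(Ystar + 1e-8):.8f}")
```

The monotonicity of r(Y) asserted above is also easy to spot-check on a grid of Y values.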

LEMMA 4.1: Let the random variable r(Y) be the posterior Bayes risk for interval I^τ. Then

(4.4a) lim_{n→∞} (n/log n)^{1/2} r(Y) = 0 if θ < 0, w.p.1,

(4.4b) lim_{n→∞} (n/log n)^{1/2} r(Y) = 2cv if θ > 0, w.p.1,

(4.4c) lim_{n→∞} (n/log n)^{1/2} r(Y) = cv if θ = 0, w.p.1.

PROOF: From (4.1), (4.2), the definitions (2.1), (2.5), (2.8), (2.9), and the tail approximation to the
normal, we find

(4.5a) lim_{n→∞} (n/log n)^{1/2} r⁺(x) = 2cv, x > ((log n)/n)^{1/2},

(4.5b) lim_{n→∞} (n/log n)^{1/2} r⁻(x) = 0, x < 0, x fixed,

(4.5c) lim_{n→∞} (n/log n)^{1/2} r⁻(x) = cv[(a² + 1)^{1/2} − a], x = −av((log n)/n)^{1/2}, a > 0,

(4.5d) lim_{n→∞} (n/log n)^{1/2} r⁻(x) = cv(a + 1), x = av((log n)/n)^{1/2}, a > 0.

From the definition (2.20), we find

(4.6) σγu(γ) = v√((log n)/n) + o(√((log n)/n)) as n increases.

Using the fact that Y → θ w.p.1 as n increases, the fact that r(Y) = r⁺(Y) if Y > σγu(γ) (= r⁻(Y)
otherwise), along with (4.5a), (4.5b), and (4.6), we see that w.p.1 r(Y) = r⁺(Y) if θ > 0 (= r⁻(Y) if θ < 0),
and that (4.4a) and (4.4b) hold.

For θ = 0, we use the law of the iterated logarithm to conclude that w.p.1, |Y| < 2v√((2 log log n)/n),
so that w.p.1, (n/log n)^{1/2} r(Y) is given by either (4.5c) if Y < 0 or by (4.5d) if Y > 0, with a = 0. Both
expressions agree in that case, giving (4.4c) and completing the proof.

It is immediately evident from the proof above that

(4.7) sup_n (n/log n)^{1/2} E(r(Y)) < ∞,

where the expectation is with respect to the marginal density f_τ(y).

We now consider the construction of sequential Bayes stopping rules where the cost per observa-
tion is a constant, C₀, with stopping cost at stage n defined as

(4.8) C₀n + L(θ, I),

with L(θ, I) given by (1.1). Proceeding as in Bickel and Yahav [6], we define the stopping rule: stop
for the first n such that

(4.9) r(Y)[1 − ((n log (n + 1))/((n + 1) log n))^{1/2}] ≤ C₀,

which is obtained by writing r(Y) as ((log n)/n)^{1/2} V, and finding the n which minimizes (nC₀ + r(Y)).
From Lemma 4.1 and (4.7), it is clear that the conditions of Theorems 2.1 and 3.1 of Bickel and Yahav
[6] hold if (n/log n)^{1/2} is read in place of n^β. This substitution causes no difficulties in the proofs of these
theorems. (The requirement that V > 0 is met since none of our distributions put any mass on θ < 0.)
Expanding the expression in (4.9), we obtain the approximate rule:

(4.10) stop as soon as (r(Y)/2n) ≤ C₀.

Summarizing the conclusions indicated above, we have

THEOREM 4.1: For the loss (4.8), the stopping rules (4.9) and (4.10) are A.P.O. and asymptotically
optimal in the sense of Bickel and Yahav [6].
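The expansion step from (4.9) to (4.10) can be checked numerically: the factor 1 − ((n log(n + 1))/((n + 1) log n))^{1/2} behaves like (1/2n)(1 − 1/log n + …), so 2n times it tends to 1, though slowly. A quick check (illustrative only):

```python
import math

def boundary_factor(n):
    """Exact factor multiplying r(Y) in the stopping rule (4.9)."""
    return 1.0 - math.sqrt((n * math.log(n + 1)) / ((n + 1) * math.log(n)))

for n in (10**2, 10**4, 10**6):
    f = boundary_factor(n)
    # (4.10) replaces this factor by 1/(2n); the ratio tends to 1 like 1 - 1/log n
    print(f"n = {n:>8}: 2n * factor = {2 * n * f:.4f}, "
          f"1 - 1/log n = {1 - 1/math.log(n):.4f}")
```

The slow (logarithmic) convergence explains why (4.10) is only an approximate version of (4.9) at moderate sample sizes.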

5. COST DUE TO LENGTH WEIGHTED BY A FUNCTION OF THE MEAN

In this section, we take a brief look at the intervals which result from using a loss function of
the form

(5.1) L(θ, I) = h(θ)(δ₂(Y) − δ₁(Y)) + {0 if δ₁(Y) ≤ θ ≤ δ₂(Y); 1 otherwise},

where the relative importance of the length of the interval depends on the size of the unknown mean
being estimated, and the constant c of (1.1) is absorbed in h(θ). Proceeding as in Lemma 2.1, we find
the Bayes solutions to have the form

(5.2) ξ(θ, y) = 1 if g_τ(θ|y) ≥ E[h(θ)|y],

      = 0 otherwise,

where

(5.3) E[h(θ)|y] = ∫₀^∞ h(θ) g_τ(θ|y) dθ,

and g_τ(θ|y) is given by (2.4). This leads to intervals of the form

(5.4) δ₁(Y) = max [0, M(γ, Y) − G(γ, Y)]

      δ₂(Y) = max [0, M(γ, Y) + G(γ, Y)],

where M(γ, Y) is given by (2.8), and

(5.5) G(γ, Y) = {(2σ²/γ²) log [(γ²/2πσ²)^{1/2}/(E[h(θ)|Y] Φ(Y/σγ))]}^{1/2}.

For a solution of (5.2) to exist, some restriction on h(θ) is needed, such as

sup_{0≤θ<∞} h(θ) ≤ (2πσ²)^{−1/2},

which is quite severe, but which also assures that δ₂(Y) > 0, all Y. As we vary h(θ), we can obtain
quite different properties for the resulting interval, the effect of h(θ) entering through the half-length
G(γ, Y).




The requirement that the δ_i be monotone can be written as

(5.6) G(γ, Y) − M(γ, Y) + [B(Y, θh(θ))/B(Y, h(θ))] ≥ 0 for δ₁,

      G(γ, Y) + M(γ, Y) − [B(Y, θh(θ))/B(Y, h(θ))] ≥ 0 for δ₂,

where, for any function r(θ),

(5.7) B(Y, r(θ)) = ∫₀^∞ r(θ) e^{−(γ²/2σ²)(θ − (Y/γ²))²} dθ.

If it is desired to choose h(θ) so that the length of the interval behaves for large M approximately like
a given function f(M), h(θ) must at least approximately satisfy

(5.8) B(Y, h(θ)) = e^{−(γ²/8σ²) f²(M)}.

Note that the condition δ₂(Y) > 0 can be written

(5.9) B(Y, h(θ)) < e^{−Y²/(2σ²γ²)} for Y < 0.

For f(M) = λM (λ any constant), (5.8) and (5.9) may conflict. From (5.7) we expect as a first approxima-
tion that for large M, B(Y, h(θ)) ≈ h(M), so that in view of (5.8), we would try

(5.10) h(θ) ≈ e^{−Aθ²}

for some A > 0, in order to have length approximately f(M) for large M. When M is negative, the length
is not 2G(γ, Y) but M + G, and will not necessarily be of the form f(M). In fact, as M → −∞ we generally
want (M + G) → 0, regardless of the form of the length for larger M.

Paulson [16] suggests a rationale for desiring a half-length of order λM (λ < 1) for large M and ap-
proaching 0 for small M, in a problem where θ represents the mean of a difference between two normal
variables. We shall construct here an interval having this property by choosing h(θ) = ce^{−Aθ²}. The
resulting B(Y, h(θ)) is

(5.11) c(2πζ²/(1 + 2Aζ²))^{1/2} e^{−AM²/(1+2Aζ²)} Φ(M/(ζ√(1 + 2Aζ²))),

where

(5.12) ζ² = σ²/γ².

Since

(5.13) G(γ, Y) = {−2ζ² log B(Y, h(θ))}^{1/2},

we see that for large M, the interval length is to a first order approximation

(5.14) 2(2Aζ²/(1 + 2Aζ²))^{1/2} M.
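The closed form (5.11) is just a Gaussian integral over (0, ∞), and it can be checked against direct numerical quadrature of (5.7) with r(θ) = ce^{−Aθ²}. A sketch with assumed values c = 0.1, A = 0.5, ζ = 1 (none of these come from the paper):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def B_closed(M, c, A, zeta):
    """Closed form (5.11) for B(Y, h) with h(theta) = c exp(-A theta^2)."""
    s = 1.0 + 2.0 * A * zeta**2
    return (c * math.sqrt(2.0 * math.pi * zeta**2 / s)
            * math.exp(-A * M * M / s) * Phi(M / (zeta * math.sqrt(s))))

def B_numeric(M, c, A, zeta, upper=40.0, steps=200_000):
    """Trapezoidal evaluation of (5.7): integral over theta >= 0 of
    c exp(-A theta^2) exp(-(theta - M)^2 / (2 zeta^2))."""
    h_step = upper / steps
    total = 0.0
    for i in range(steps + 1):
        th = i * h_step
        w = 0.5 if i in (0, steps) else 1.0
        total += w * c * math.exp(-A * th * th) * math.exp(-(th - M)**2 / (2.0 * zeta**2))
    return total * h_step

c, A, zeta = 0.1, 0.5, 1.0      # illustrative values only
for M in (-2.0, 0.0, 2.0):
    print(f"M = {M:4.1f}: closed = {B_closed(M, c, A, zeta):.6e}, "
          f"numeric = {B_numeric(M, c, A, zeta):.6e}")
```

At M = 0 the integral reduces to (c/2)√(2πζ²/s), a convenient sanity check on the constant in (5.11).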

Using the tail approximation for Φ(−x) as x increases, we find that for Y → −∞

(5.15) δ₂(Y) = ζ²(log |M(γ, Y)|)/|M(γ, Y)| + O(1/|Y|),

which is the same as for h(θ) = c in (3.3). From (5.14), we see that A can be chosen as

λ²/(2ζ²(1 − λ²)).

Rewriting B(Y, h(θ)) as

(5.16) e^{−M²/(2ζ²)} ∫₀^∞ h(θ) e^{−(θ² + 2θ|M|)/(2ζ²)} dθ

for M < 0, and noting that the integral in (5.16) is (ζ²h(0)/|M|) plus terms of smaller order as M → −∞,
we see from (5.13) that δ₂(Y) will always have the order of magnitude indicated in (5.15) as Y → −∞.
Sequential procedures can be developed from the posterior expected loss function for these loss
functions, just as was done in Section 4 for the loss function (1.1).

APPENDIX

Proofs for Section 2

In this Appendix, we supply the proofs of Lemmas 2.3 and 2.4 and Corollary 2.3. We start with

PROOF OF LEMMA 2.3: Write v = u − (γθ/σ) in (2.20), so that we are looking for the two solutions
(one positive, one negative) of

(A.1) Φ[v + (γθ/σ)] = (γ/cσ)φ(v).

The right side of (A.1) exceeds the left at v = 0. At v = ±[2 log (γ/cσ√(2π))]^{1/2}, we have unity for the
right side, which exceeds the left side. Since Φ(x) is monotone increasing, and φ(x) increases for x < 0
and decreases for x > 0, (2.23) is proved. Noting that, as θ increases, Φ[v + (γθ/σ)] increases mono-
tonely to unity for each v, gives the monotonicity and limit results for v_i(γ, θ). The monotonicity of
u_i(γ, θ) follows from the defining relation (2.20).

If Ui{y, d) > 0, then 4>(w,) > 1 — (<p(ui)lu,i), so that for v satisfying (A.l) 

(A.2) (y/co-Mi;) > 1 - (?[t;+ (yd/(r)]l[v+ (yd/a)]) 

or rearranging and taking logarithms 

e -l/2[(y0M 2 + 2i>(y0M]-] y 2 



SO 



that 



- log V*n- + log (y/ca) + log 1 1 + (ca/y) [|;+(y g /(r)] } > 



log (y/co- V2^) + («r/y) [t;+(y9M] > , 



and since \/A + fi < V4 + VB, 

„-(l/4)[(y#/ < 7)2 + 2iXye/o-)] 

(A.3) [2 log (y/orVS)]* + (2co-/y)" 2 [|>+ (yg/<r)]lft > | V |. 

Because f,(y, 0) has a limit, for d > 9* (say), ^(y, 0) > (— k) (specified k) (and [(yd/a) — 2k] > 1) 
so that 






[2 log (γ/cσ√(2π))]^{1/2} + (2cσ/γ)^{1/2} e^{−(1/4)(γθ/σ)} > |v|,

which proves (2.24). From (2.20), we have

Φ(u(γ)) − Φ(u(1)) = (1/cσ)(γφ(u(γ)) − φ(u(1))) > 0.

By Taylor's expansion,

(u(γ) − u(1)) = (1/cσ)(φ(u(1))/φ(ū))[γ(φ(u(γ))/φ(u(1))) − 1],

where

u(1) ≤ ū ≤ u(γ),

so that, since (φ(u(1))/φ(ū)) ≤ (φ(u(1))/φ(u(γ))) and (φ(u(γ))/φ(u(1))) < 1,

(A.4) (u(γ) − u(1)) < (1/cσ)γ[γ − 1] < (1/cσ)(1 + (σ²/2τ²))(σ²/2τ²),

using √(1 + x) < 1 + (x/2). When τ ≥ 1, (2.25) follows from (A.4). This completes the proof of the lemma.

Next, we give

PROOF OF LEMMA 2.4: We break the difference

(A.5) τ² ∫₀^∞ [R(θ, I^p) − R(θ, I^τ)] g_τ(θ) dθ

into four pieces, which will be bounded separately. First, consider the contribution due to noncoverage
probabilities, namely

(A.6) τ² ∫₀^∞ {[Φ(γu₂(γ, θ) − (θ/σ)) − Φ(γu₁(γ, θ) − (θ/σ))]

− [Φ(u₂(1, θ) − (θ/σ)) − Φ(u₁(1, θ) − (θ/σ))]} g_τ(θ) dθ.

Using the mean value theorem, we write the { } term in (A.6) as

(A.7) (γ − 1) G(γ*, θ),

where

(A.8) G(γ*, θ) = Σ_{i=1}^{2} (−1)^i φ(γ* u_i(γ*, θ) − (θ/σ)) [γ*(∂u_i(γ*, θ)/∂γ) + u_i(γ*, θ)], 1 ≤ γ* ≤ γ.

Differentiating (2.20) with respect to γ, writing u for u(γ, θ), and simplifying, we obtain

φ(u)(∂u/∂γ) = (1/cσ){φ(u − (γθ/σ)) − γ[(∂u/∂γ) − (θ/σ)][u − (γθ/σ)]φ(u − (γθ/σ))},






and using (2.20) again, 

<p(u) {duldy)=<t>(u){{lly) ~ [(du/dy) - (0/<r)] [u- (Ola)]} 
so that 

(A.9) (duldy) = {(lly) + (0la)(u-(0l(T))}l{(<p(u)l<i>(u))+{u-(6l<r))}. 

Note that r 2 (y-l) < (o- 2 /2), 

so that we want to show the boundedness of

(A.10) $\qquad \int_0^\infty g_\tau(\theta)\, G(y^*, \theta)\, d\theta,$

and we have

(A.11) $\qquad G(y^*, \theta) = \sum_{i=1}^{2} (-1)^i\,\varphi\!\left(y^* u_i(y^*, \theta) - \frac{\theta}{\sigma}\right)\; \frac{\displaystyle 1 + \left(u_i(y^*, \theta) + \frac{y^*\theta}{\sigma}\right)\frac{\varphi(u_i(y^*, \theta))}{\Phi(u_i(y^*, \theta))}}{\displaystyle \frac{\varphi(u_i(y^*, \theta))}{\Phi(u_i(y^*, \theta))} + u_i(y^*, \theta) - \frac{y^*\theta}{\sigma}}.$

The denominator in (A.11) is of order $(1/|u|^3)$ as $u \to -\infty$, so that when the $\varphi(y^*u - (\theta/\sigma))$
term is taken into account, we see that $G(y^*, \theta)$ is bounded as long as each $u_i(y^*, \theta)$ is bounded
above. This is the case if $\theta$ is bounded above, say by $\Delta$. We next note from Lemma 2.3 that the denominator in (A.11) has a limit as $\theta \to \infty$ and is bounded away from zero for $\theta > \Delta$. Since $(u\varphi(u)/\Phi(u)) \to 0$
as $u \to \infty$ and we can write $[u + (y^*\theta/\sigma)] = [(u - (y^*\theta/\sigma)) + 2y^*\theta/\sigma]$, where $(u - (y^*\theta/\sigma))$ is bounded
in the region of interest, we see that for $\theta > \Delta$, $G(y^*, \theta)$ is composed of some terms which are uniformly
bounded plus a term of the form

(A.12) $\qquad \left\{\left[\left(u_1 - \frac{y^*\theta}{\sigma}\right)\frac{\varphi(u_1)}{\Phi(u_1)} - \left(u_2 - \frac{y^*\theta}{\sigma}\right)\frac{\varphi(u_2)}{\Phi(u_2)}\right]\varphi\!\left(y^*u_1 - \frac{\theta}{\sigma}\right) + \left(u_1 - \frac{y^*\theta}{\sigma}\right)\left(u_2 - \frac{y^*\theta}{\sigma}\right)\left[\varphi\!\left(y^*u_2 - \frac{\theta}{\sigma}\right) - \varphi\!\left(y^*u_1 - \frac{\theta}{\sigma}\right)\right]\right\},$
where the arguments of $u_1$ and $u_2$ are suppressed to save space. By Lemma 2.3, for $\theta$ large enough,
$u_1 > (\theta/\sigma) - k$, so that $\varphi(u_1) \to 0$ as $\theta$ increases and the first two terms in (A.12) are uniformly bounded.
Since $(u_1 - y^*\theta/\sigma)$ has a limit by Lemma 2.3 as $\theta$ increases, our attention can be limited to

(A.13) $\qquad \int_\Delta^\infty \theta\, g_\tau(\theta)\left[\varphi\!\left(y^*u_2 - \frac{\theta}{\sigma}\right) - \varphi\!\left(y^*u_1 - \frac{\theta}{\sigma}\right)\right] d\theta = \int_\Delta^\infty \left(\frac{\theta}{\tau}\right) g_\tau(\theta)\left\{\tau\left[\varphi\!\left(y^*u_2 - \frac{\theta}{\sigma}\right) - \varphi\!\left(y^*u_1 - \frac{\theta}{\sigma}\right)\right]\right\} d\theta,$

which we now examine. Write the { } term as 

(A.14) $\qquad \tau\,\varphi\!\left(y^*u_1 - \frac{\theta}{\sigma}\right)\left\{e^{-\frac{1}{2}\,y^*(u_2-u_1)\left[\,y^*(u_1+u_2) - (2\theta/\sigma)\,\right]} - 1\right\}.$




Using $v_i$ defined by (A.1) and the definition (2.5), the exponent in (A.14) becomes

(A.15) $\qquad -\frac{1}{2}\,(u_2 - u_1)\,y^*\left[(2\theta\sigma/\tau^{*2}) + y^*(v_1 + v_2)\right],$

where $\tau < \tau^* < \infty$. By (2.24), we may take $\Delta$ sufficiently large so that

(A.16) $\qquad |v_1 + v_2| < 2\sqrt{2c\sigma/y^*}\; e^{-(1/4)(y^*\theta/\sigma)} < 2\sqrt{2c\sigma}\; e^{-\theta/4\sigma}$

and 

(A.17) $\qquad |u_2 - u_1| < 4\,[2\log(y^*/c\sigma\sqrt{2\pi})]^{1/2}.$

If (A.15) is negative, then (A.14) is negative, and (A.13) is bounded above by 0. Consider the set of $\theta$
for which (A.15) is positive, i.e., $(v_1 + v_2) < 0$ and $0 < (2\theta\sigma/\tau^{*2}) < -(v_1 + v_2)$, so that by (A.16) and
(A.17), (A.15) is less than

(A.18) $\qquad 4\,[4c\sigma \log(y^*/c\sigma\sqrt{2\pi})]^{1/2}\; y^{*2}\, e^{-\theta/4\sigma}.$

Assume $\tau$ large enough so that $y^* \le y < 2$; then it is clear that for $\Delta$ large enough there is a constant $D$,
say, so that (A.14) is less than

(A.19) $\qquad D\,\tau\, e^{-\theta/4\sigma} < D\,(\tau/\theta),$

so that (A.13) is bounded by

(A.20) $\qquad \int_\Delta^\infty D\, g_\tau(\theta)\, d\theta < D.$

Putting together the above pieces, we obtain the boundedness of (A.6). Next, we must examine the contributions to (A.5) due to the interval lengths. The first two terms of (2.19) represent

$$(1/y^2\sigma\sqrt{2\pi})\int_{-\infty}^{yu(y)} x\, e^{-x^2/2\sigma^2}\, dx,$$

which contributes to $r(\tau, I'')$,



(A.21) $\qquad \frac{1}{\pi\sigma y^2\tau}\int_0^\infty e^{-\theta^2/2\tau^2}\left[\int_{-\infty}^{yu(y)} x\, e^{-(x-\theta)^2/2\sigma^2}\, dx\right] d\theta = \frac{\tau\sigma}{y^2\sqrt{2\pi}\,\sqrt{1 + (\sigma^2/\tau^2)}}\;\Phi(yu(y))\; e^{-\frac{y^2 u^2(y)}{2(1 + (\tau^2/\sigma^2))}}.$



The corresponding term for $r(\tau, I')$ is given by replacing $y$ by 1 in (A.21). When the two terms are
differenced, and (2.25) is used, the contribution to (2.26) will be found by straightforward computations
to be $O(1/\tau)$. Next, we examine the term

$$(2\sqrt{2}\,\sigma/y)\int_{-\infty}^{\infty}\left[-\log\left(c\sqrt{2\pi}\,(\sigma/y)\,\Phi\!\left(y^{-1}(x + (\theta/\sigma))\right)\right)\right]^{1/2}\varphi(x)\, dx,$$

which, when averaged w.r.t. $g_\tau(\theta)$, becomes, by changing the order of integration,

(A.22) $\qquad \frac{4\sigma}{\sqrt{\pi(1 + (\tau^2/\sigma^2))}}\int_{-\infty}^{\infty}\left[-y^{-2}\log\left(c\sqrt{2\pi}\,(\sigma/y)\,\Phi(x/y)\right)\right]^{1/2}\varphi(x/y)\, e^{-x^2(y^2-1)/2y^2}\, dx.$

A similar expression for the term in $r(\tau, I')$ replaces $y$ by 1 in the square root term, but is otherwise the
same. The contribution to (A.5) is then, omitting constants, writing $(1 + (\tau^2/\sigma^2))$ as $y^2/(y^2 - 1)$, and
writing $D$ for $(c\sqrt{2\pi}\,\sigma)$,



(A.23) $\qquad \tau^2\,\frac{\sqrt{y^2-1}}{y}\int_{-\infty}^{\infty}\left\{[-\log D\Phi(x)]^{1/2} - (1/y)\,[-\log(D/y)\Phi(x/y)]^{1/2}\right\}\Phi(x/y)\; e^{-x^2(y^2-1)/2y^2}\, dx.$

Rewrite the $\Phi(x/y)\{\ \}$ term as

$$[-\log D\Phi(x)]^{1/2}\,[\Phi(x/y) - \Phi(x)] + \left\{\Phi(x)\,[-\log D\Phi(x)]^{1/2} - (1/y)\,\Phi(x/y)\,[-\log(D/y)\Phi(x/y)]^{1/2}\right\},$$

and use the mean value theorem to write it as

$$(y-1)\,H(x, y^*), \qquad \text{where } 1 < y^* < y,$$

and

$$H(x, y^*) = (1/y^{*2})\left\{-\,(x/y^*)\,\varphi(x/y^*)\,[-\log D\Phi(x)]^{1/2} + [-\log(D/y^*)\Phi(x/y^*)]^{1/2}\left\{\Phi(x/y^*) + (x/y^*)\,\varphi(x/y^*) + (1/2)\,[-\log(D/y^*)\Phi(x/y^*)]^{-1}\left[(x/y^*)\,\varphi(x/y^*) - \Phi(x/y^*)\right]\right\}\right\}.$$

It is easily seen, using the tail approximations for the normal distribution, that if $y \le 2$ (say), then $H(x, y^*)$
is bounded uniformly in $x$ and $1 \le y^* \le 2$. Since $\tau^2(y-1)$ is bounded also, we have for some constant
$J$ that (A.23) is less than



$$J\,\frac{\sqrt{y^2-1}}{y}\int_{-\infty}^{\infty} e^{-x^2(y^2-1)/2y^2}\, dx = \sqrt{2\pi}\,J.$$



The remaining term in (2.19) contributes to $r(\tau, I'')$ a constant times the integrand of (A.22) integrated over the range $(-\infty, yu(y))$. Break the range into $(-\infty, u(1))$ and $(u(1), yu(y))$. Over the
range $(-\infty, u(1))$, look at the difference written as in (A.23), and it is seen that the same bound applies.
What remains is the integral over $(u(1), yu(y))$. In this range, the integrand of (A.22) is bounded
uniformly, and by (2.25) the length of the range is $O(\tau^{-2})$. This completes the proof of Lemma 2.4. Next
we prove Corollary 2.3.

PROOF OF COROLLARY 2.3: To evaluate the limit of (A.5) as $\tau^2 \to \infty$, we use the first terms of
Taylor expansions in place of mean value theorem approximations, i.e., use

$$\left[\frac{\varphi(u_1(1,\theta))}{\Phi(u_1(1,\theta))} + u_1(1,\theta) - \frac{\theta}{\sigma}\right]^{-1}$$

and

$$H(x, 1) = [-\log D\Phi(x)]^{1/2}\left\{\Phi(x) + (1/2)\,[-\log D\Phi(x)]^{-1}\,(x\varphi(x) - \Phi(x))\right\}$$

in place of $G(y^*, \theta)$ and $H(x, y^*)$, respectively. In (A.10) substitute $\theta = \tau\eta$ and consider the limit,

$$\lim G(1, \tau\eta) = 2c\sigma\left\{[-2\log c\sigma\sqrt{2\pi}]^{1/2} + [-2\log c\sigma\sqrt{2\pi}]^{-1/2}\right\}.$$

Noting that $\tau^2(y-1) \to (\sigma^2/2)$, we have for the limit of (A.6),

(A.24) $\qquad c\sigma^3\left\{[-2\log c\sigma\sqrt{2\pi}]^{1/2} + [-2\log c\sigma\sqrt{2\pi}]^{-1/2}\right\}.$



In (A.23), use $z = x\,(\sqrt{y^2-1}/y)$ and

(A.25) $\qquad \lim_{\tau\to\infty} H\!\left(\frac{zy}{\sqrt{y^2-1}},\, 1\right) = \begin{cases} [-\log c\sigma\sqrt{2\pi}]^{1/2} - (1/2)\,[-\log c\sigma\sqrt{2\pi}]^{-1/2}, & \text{if } z > 0,\\[4pt] 0, & \text{if } z < 0, \end{cases}$

to obtain a limit for (A.23) of

(A.26) $\qquad \sqrt{2}\,\sigma^3\left\{[-\log c\sigma\sqrt{2\pi}]^{1/2} - (1/2)\,[-\log c\sigma\sqrt{2\pi}]^{-1/2}\right\}.$

For the difference over the range $(-\infty, u(1))$, note that after transformation the range is
$(-\infty,\, u(1)\,[\sqrt{y^2-1}/y]\,)$, which approaches $(-\infty, 0)$; noting (A.25), this contributes nothing. The integral over $(u(1), yu(y))$
is easily seen to be $O(\tau^{-3})$ and contributes negligibly. The net result is that (after multiplying (A.26)
by $c$) we have (2.27) and the proof of Corollary 2.3 is complete.

REFERENCES 

[1] Aitchison, J. and I. R. Dunsmore, "Linear-Loss Interval Estimation of Location and Scale Parameters," Biometrika, 55, 141-148 (1968).

[2] Bartholomew, D. J., "Contribution to the Discussion of the Paper by C. M. Stein, 'Confidence Sets for the Mean of a Multivariate Normal Distribution,'" J. Roy. Statist. Soc. B, 24, 291-292 (1962).

[3] Bartholomew, D. J., "A Comparison of Some Bayesian and Frequentist Inferences," Biometrika, 52, 19-35 (1965).

[4] Bickel, P. J. and J. A. Yahav, "Asymptotically Pointwise Optimal Procedures in Sequential Analysis," Proc. Fifth Berk. Symp. Math. Statist. Prob. (Univ. of California Press, 1965).

[5] Bickel, P. J. and J. A. Yahav, "Some Contributions to the Asymptotic Theory of Bayes Solutions," Stanford University Dept. of Statistics, Technical Rept. No. 24 (1967).

[6] Bickel, P. J. and J. A. Yahav, "Asymptotically Optimal Bayes and Minimax Procedures in Sequential Estimation," Ann. Math. Statist. 39, 442-456 (1968).

[7] Blumenthal, S. and A. Cohen, "Estimation of the Larger Translation Parameter," Ann. Math. Statist. 39, 502-516 (1968).

[8] Dudewicz, E. J., "Multiple-Decision (Ranking and Selection) Procedures: Estimation," Unpublished Thesis, Cornell University (1969).

[9] Farrell, R., "Estimators of a Location Parameter in the Absolutely Continuous Case," Ann. Math. Statist. 35, 949-998 (1964).

[10] Feller, W., An Introduction to Probability Theory and Its Applications (J. Wiley, New York, 1957), Vol. 1, 2nd ed.

[11] Ferguson, T. S., Mathematical Statistics, A Decision Theoretic Approach (Academic Press, New York, 1967).

[12] Joshi, V. M., "Admissibility of the Usual Confidence Sets for the Mean of a Univariate or Bivariate Normal Population," Ann. Math. Statist. 40, 1042-1067 (1969).

[13] Katz, M. W., "Admissible and Minimax Estimates of Parameters in Truncated Spaces," Ann. Math. Statist. 32, 136-142 (1961).

[14] Konijn, H., "Some Estimates Which Minimize the Least Upper Bound of a Probability Together with the Cost of Observation," Ann. Inst. Statist. Math. 7, 143-158 (1956).

[15] Konijn, H., "Minimax Interval Estimates with a Shortness Criterion: A New Formulation," Ann. Inst. Statist. Math. 15, 79-81 (1964).

[16] Paulson, E., "Sequential Interval Estimation for the Means of Normal Populations," Ann. Math. Statist. 40, 509-516 (1969).

[17] Sacks, J., "Generalized Bayes Solutions in Estimation Problems," Ann. Math. Statist. 34, 751-768 (1963).

[18] Stein, C. M., "Confidence Sets for the Mean of a Multivariate Normal Distribution," J. Roy. Statist. Soc. B, 24, 265-296 (1962).



THE VALUE STATEMENT 



M. E. Nightengale 
U.S. Air Force Academy 



ABSTRACT 

Frequently there exists an insufficient amount of actual data upon which to base a 
decision. In this paper a method is presented whereby the subjective opinions of a group of 
qualified persons are utilized to quantify the relative importance of a finite number of param- 
eters or objectives. A means of testing the consistency among the judges is given which 
allows the decision maker to determine the validity of the opinions gathered. 

The application presented here is in the area of Multiple Incentive Contracting; namely,
a method is proposed to facilitate the answering of the question, "What is the value to the
purchaser of an incremental change in the performance of a system?" This particular vehicle is not,
however, essential to the methodology proposed.

INTRODUCTION 

A current trend in governmental procurement is the use of a form of systems analysis in structuring 
Multiple Incentive Contracts. The incentives are structured in such a manner that the government will 
obtain a product that results in the minimum total cost for development, procurement, and operation 
of the product in the fulfillment of a defined mission. The scale of payments from the buyer to the 
seller depends upon an explicit statement of future contingencies; that is, the contractor receives 
different rewards for different levels of performance. Where does this explicit statement originate? 
The search for an answer to that question is likely to unearth another question; namely, how was the 
mission defined? What thoughts, judgments, and opinions went into the definition of the mission? 

Notice that the question itself is phrased in subjective terminology. Projections into the future 
must necessarily be based on subjective evaluations rather than on quantitatively derived forecasts. 
Even if there were some mathematical model available, the input assumptions and the interpretation 
of the output are functions of the expertise of the individuals involved. 

Since there is no proper theoretical foundation, and since one must rely to some extent on sub- 
jective evaluations, there are but two options available to the buyer: (1) he can wait until there exist 
sufficiently adequate quantitative methods, or (2) he can make the best of a poor situation and try to 
use the opinions of a group of experts in some reliable and beneficial manner. Thus, one must almost 
inevitably arrive at the decision to use the best data available — subjective evaluations. 

This article will not delve into, nor include a discussion of, incentive contracting and the philosophy 
underlying it. The interested reader is invited to read the Air Force Academy Manual, "The Evaluation 
and Structuring Techniques of Multiple Incentive Contracts." 

Thus far one has seen that the questions of how the rewards are determined and, in fact, how the 
mission itself has been defined have led to the search for a method or methods of handling subjective 
evaluations. Methods for quantifying such evaluations exist in the literature today. One of the major 
problems in structuring incentive contracts has been that the proposed "solution" commences with the 

assumption that the fundamental problem has been solved! That is to say, a curve has been drawn, 
an equation postulated or some such scheme, and what remains to be accomplished is to manipulate 
the tangents or derivatives and arrive at the proper conclusion. The basic problem, however, remains 
unsolved. 

The specific statement often referred to as "the value statement" is merely a statement of the 
value to the government of an incremental increase, or decrease, in some parameter. In multiple 
incentive contracting, this "value" is measured in dollars. The problem posed is then one of somehow 
assigning these dollar values. 

Since this is such a specific problem, it will perhaps be of more interest and value to present here 
a method for handling the general case; that is, assigning numerical values to qualitative judgments in 
discrete cases. The application of a general method to the specific problem will be readily apparent. 
First, consider a discrete problem which may be exemplified by having a group of judiciously selected 
experts choose the most desirable parameters to incentivize from among all of the parameters. The 
basis for this method of solution is the Law of Comparative Judgment, described by Thurstone [6] in 
1927. A detailed explanation and illustrative examples of this method may be found in Nightengale 
[4, 5]. An explanation of the computational procedure will be presented here. 

COMPUTATIONAL PROCEDURE 

The computational procedure associated with the law of comparative judgment applied to ranking 
the parameters involves the tabulation of the number of times parameter i is judged more significant 
than parameter j. This tabulation is shown for n parameters in Table I. 

Table I. Matrix A: Number of Times Parameter i Judged More Significant Than Parameter j

                                 Parameter j
  Parameter i      1       2       3     ...     j     ...     n

      1            -      a_12    a_13          a_1j          a_1n
      2           a_21     -      a_23          a_2j          a_2n
      3           a_31    a_32     -            a_3j          a_3n
      .
      i           a_i1    a_i2    a_i3          a_ij          a_in
      .
      n           a_n1    a_n2    a_n3          a_nj           -

a_ij = the number of times parameter i is judged to be more significant than parameter j, i ≠ j.



It should be noted that ties or indifference judgments are not allowed. This is consistent with the 
assumptions involved in Thurstone's Case Five [6, p. 45], upon which this procedure is based. Although 
the computations introduced by allowing ties could be easily handled, the law of comparative judgment 
provides for only two choices. One should also be aware that if the number of parameters to be judged 
is too great, fatigue and indifference will become important factors. The main diagonal of Matrix A 






will be left blank because a parameter is not judged against itself. The total number of judgments
required to obtain the matrix is n(n - 1)/2.
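As an illustrative check (not part of the paper), the count n(n - 1)/2 is simply the number of unordered pairs of parameters, C(n, 2):

```python
from math import comb

# n(n - 1)/2 equals the number of unordered pairs of parameters, C(n, 2).
for n in (4, 10):
    print(n, n * (n - 1) // 2, comb(n, 2))  # 4 -> 6 judgments, 10 -> 45 judgments
```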

Matrix P is constructed from Matrix A and is used to display the percentile of times of the total
judgments that parameter i was judged more significant than parameter j (see Table II). Once again
the diagonal will contain zeros. The elements of the P Matrix have the property that p_ij + p_ji = 1.

Table II. Matrix P: Percentile of Times Parameter i Judged More Significant Than Parameter j

                                 Parameter j
  Parameter i      1       2       3     ...     j     ...     n      Row totals

      1            -      p_12    p_13          p_1j          p_1n       P_1
      2           p_21     -      p_23          p_2j          p_2n       P_2
      3           p_31    p_32     -            p_3j          p_3n       P_3
      .
      i           p_i1    p_i2    p_i3          p_ij          p_in       P_i
      .
      n           p_n1    p_n2    p_n3          p_nj           -         P_n

p_ij = the percentile of times parameter i is judged to be more significant than parameter j, i ≠ j.
p_ij = a_ij / c, where c is the number of judges.

Matrix Z is used to convert Matrix P into standard measurements of separation in terms of the
equal standard deviations of the discriminal dispersion scale (see Table III).

Table III. Matrix Z: Basic Transformation Matrix

                                 Parameter j
  Parameter i      1       2       3     ...     j     ...     n      Total    Mean

      1            -      z_12    z_13          z_1j          z_1n      Z_1      Z̄_1
      2           z_21     -      z_23          z_2j          z_2n      Z_2      Z̄_2
      3           z_31    z_32     -            z_3j          z_3n      Z_3      Z̄_3
      .
      i           z_i1    z_i2    z_i3          z_ij          z_in      Z_i      Z̄_i
      .
      n           z_n1    z_n2    z_n3          z_nj           -        Z_n      Z̄_n

z_ij = the separation between parameter i and parameter j in terms of standard deviations, i ≠ j.
Z_i = Σ_j z_ij,   Z̄_i = Z_i / n.




The element z_ij is the standardized normal deviate corresponding to the element p_ij. The element
z_ij will be positive for all values of p_ij greater than 0.50 and negative for all values of p_ij less than 0.50.
Zeros are entered in the diagonal cells. The matrix is skew-symmetric; that is, z_21 = -z_12. That this
is necessarily so may be shown as follows:

    p_ij = G(z_ij) = ∫_{-∞}^{z_ij} (2π)^{-1/2} exp{-z²/2} dz,

where z_ij is the unknown standardized normal variate. Since p_ij = 1 - p_ji, then

    ∫_{-∞}^{z_ij} (2π)^{-1/2} exp{-z²/2} dz = 1 - ∫_{-∞}^{z_ji} (2π)^{-1/2} exp{-z²/2} dz
                                            = ∫_{z_ji}^{∞} (2π)^{-1/2} exp{-z²/2} dz
                                            = ∫_{-∞}^{-z_ji} (2π)^{-1/2} exp{-z²/2} dz.

Hence, z_ij = -z_ji.
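As an aside not in the original text, this skew-symmetry is easy to confirm numerically; the sketch below uses Python's standard-library normal distribution (the percentile 0.7 is an arbitrary illustrative value):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution

# Suppose parameter i beats parameter j in 7 of 10 judgments: p_ij = 0.7, p_ji = 0.3.
p_ij = 0.7
z_ij = nd.inv_cdf(p_ij)      # standardized normal deviate for p_ij
z_ji = nd.inv_cdf(1 - p_ij)  # deviate for the complementary percentile p_ji

# The two deviates are equal in magnitude and opposite in sign: z_ij = -z_ji.
print(round(z_ij, 5), round(z_ji, 5))
```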

Matrix Z contains the sample estimates z_ij of the theoretical values found in the equation of the law
of comparative judgment. The element z_ij is an estimate, then, of the difference (S_i - S_j) between scale
values of the two stimuli measured in units of the standard deviation of the distribution of discriminal
differences. Each independent element of Matrix Z is an estimate of a value for one equation of the law
of comparative judgment. Z̄_i is the average difference between the scale value of the ith stimulus and
the scale value of each of the stimuli with which it was compared. Because of the manner in which the scale
was originally constructed, the Z̄_i values may then be transformed by use of Normal Tables, G(Z̄_i),
into a percentage of the area under the Normal Curve. The G(Z̄_i) values may then be normalized and
used as the decision-maker deems appropriate.
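The whole chain — Matrix P to Matrix Z, row means Z̄_i, Normal-table transform G(Z̄_i), and normalization — can be sketched in a few lines of Python. This is an illustration of the procedure described above, not code from the paper; the function name and the use of 0.5 on the diagonal (which maps to z = 0, matching the blank diagonal) are our own choices:

```python
from statistics import NormalDist

nd = NormalDist()

def scale_values(P):
    """Thurstone Case V scaling sketch: P[i][j] is the observed fraction of
    judgments in which parameter i beat parameter j (diagonal entries are
    set to 0.5 so that inv_cdf yields z = 0 there)."""
    n = len(P)
    Z = [[nd.inv_cdf(P[i][j]) for j in range(n)] for i in range(n)]  # Matrix Z
    Zbar = [sum(row) / n for row in Z]                               # means Z̄_i
    G = [nd.cdf(z) for z in Zbar]                                    # G(Z̄_i)
    total = sum(G)
    return [g / total for g in G]                                    # normalized

# Matrix P of the hypothetical example below (Table VI), diagonal set to 0.5:
P = [[0.5, 0.4, 0.4, 0.8],
     [0.6, 0.5, 0.7, 0.7],
     [0.6, 0.3, 0.5, 0.9],
     [0.2, 0.3, 0.1, 0.5]]

print([round(s, 4) for s in scale_values(P)])
```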

TESTING THE SCALE VALUES 

The reliability of these subjective estimates of the significance of the parameters is dependent 
upon the internal consistency of the data. A test exists for checking this internal consistency and con- 
sists of starting with the final scale values and working backward to the theoretical percentages de- 
manded by them. The discrepancies, designated by Ay, are compared with the mean of the absolute 
values of all of the discrepancies. This, then, is assumed to be the deviation about the expected py. The 
average deviation is computed by: 

Average Deviation = Σ|Δ_ij| / N,  where N = number of discrepancy calculations.

The average deviation has several advantages: it is easy to define, easy to compute, includes all the 
variates, and gives the appropriate weight to extreme cases. The disadvantage of using it is its unsuit- 
ability for algebraic manipulations, since signs must be adjusted in its definition. Also, if the means and 
mean deviations of two sets of variates are known, there is no ready method for calculating the mean 
deviation of the combined set without going back to the original data. 

At this point in the analysis, a test for the internal consistency of the judges' estimates has been 
stated. The evaluation of the inconsistency calls for a judgment on the part of the analyst as to the sever- 
ity of the inconsistency. The possible consequences in terms of human lives, dollars, or whatever units 
of value are involved play a most important role in this judgment. If there are significant discrepancies,
the procedure should be reevaluated to determine the possible causes and possibly point out new or 
different means of gathering data. There are several avenues that can be pursued from this point: the 
judges can be asked to once again rank the possible parameters; a new group of judges can be consulted 
to do the ranking; or, at the worst, the entire analysis can be discarded with the decision maker being 






left no worse off than he was when he was operating under complete uncertainty. This assumes that the 
time and money invested are not significant. 

If, on the other hand, there are no significant discrepancies, a tool exists for the assignment of 
significance to the parameters. 

HYPOTHETICAL EXAMPLE 

As a simple hypothetical example, consider an airplane which the government desires to have 
produced. Also, it has been decided that an incentive-type contract will be negotiated with the supplier. 
However, exactly which parameters will have incentives placed upon them and the relative importance 
of each parameter have not yet been determined. 

It is necessary to determine the significance of each parameter. Ten individuals are available 
who qualify as experts based on their past experiences, etc. 

The parameters are as follows:

    S I    Airspeed.                C III   Combat Ceiling.
    R II   Combat Range.            P IV    Payload carrying capability.

The ten people were asked to rank the four parameters subjectively in order of significance. 
The results are shown in Table IV, where a rank of 1 is best. Matrix A of the computational procedure 
for this example is shown in Table V. 

Table IV. Rank Orderings Submitted by the Judges

                           Parameters
    Judges      S I     R II    C III    P IV

      1          3        2       1        4
      2          1        2       3        4
      3          3        1       2        4
      4          1        2       3        4
      5          3        1       2        4
      6          3        1       2        4
      7          3        2       4        1
      8          3        4       1        2
      9          2        4       1        3
     10          2        1       3        4

    Σ Rank      24       20      22       34
    Average
    Rank        2.4      2.0     2.2      3.4



Table V. Matrix A: Number of Times Parameter i Judged More Significant Than Parameter j

                           Parameter j
  Parameter i     S I    R II    C III    P IV

     S I           X       4       4        8
     R II          6       X       7        7
     C III         6       3       X        9
     P IV          2       3       1        X




For example, Judge No. 7 ranked the parameters as to their significance in the following order:
P IV, R II, S I, C III.

Matrix P and Matrix Z for this example are given in Tables VI and VII.

Table VI. Matrix P: Percentage of Times Parameter i Judged More Significant Than Parameter j

                           Parameter j
  Parameter i    S I     R II    C III    P IV    Row totals

     S I          X      0.400   0.400    0.800     1.600
     R II       0.600      X     0.700    0.700     2.000
     C III      0.600    0.300     X      0.900     1.800
     P IV       0.200    0.300   0.100      X       0.600

Table VII. Matrix Z: Basic Transformation Matrix

                             Parameter j
  Parameter i      S I        R II       C III       P IV       Total      Mean (Z̄)

     S I             0      -0.25334   -0.25334    0.84161     0.33493     0.08373
     R II         0.25334       0       0.52441    0.52441     1.30216     0.32554
     C III        0.25334   -0.52441       0       1.28155     1.01048     0.25262
     P IV        -0.84161   -0.52441   -1.28155        0      -2.64757    -0.66189



The necessary values to compute the relative significance are shown in Table VIII.

Table VIII. Calculation of Relative Significance

                                          Normalized
  Parameter i      Z̄          G(Z̄)        relative
                                         significance

     S I         0.08373     0.53356       0.2647
     R II        0.32554     0.62761       0.3115
     C III       0.25262     0.59972       0.2977
     P IV       -0.66189     0.25403       0.1261



In order to check the internal consistency of the judges, it is necessary to begin with the final scale
values and reverse the computational procedure. The difference represented by (Z̄_i - Z̄_j) is transformed
by using the Normal Tables into G(z), which is the percentage of times parameter i theoretically should
have been judged to be more significant than parameter j.

Table IX. Consistency Check

        Z̄_i - Z̄_j                                       P (computed)

  Z̄_I   - Z̄_II  = 0.08373 - 0.32554    = -0.24181        0.40447
  Z̄_I   - Z̄_III = 0.08373 - 0.25262    = -0.16889        0.43295
  Z̄_I   - Z̄_IV  = 0.08373 - (-0.66189) =  0.74562        0.77205
  Z̄_II  - Z̄_III = 0.32554 - 0.25262    =  0.07292        0.52906
  Z̄_II  - Z̄_IV  = 0.32554 - (-0.66189) =  0.98743        0.83828
  Z̄_III - Z̄_IV  = 0.25262 - (-0.66189) =  0.91451        0.81977



The difference between the computed percentage of times parameter i should have been judged
more significant than parameter j and the observed percentage of times parameter i was judged to
be more significant than parameter j is designated as Δ_ij. Each Δ_ij is compared to the mean of the
absolute values of all of the differences. This is assumed to be the deviation about the expected p_ij.



Table X. Deviation Calculations

  Δ_I-II   = 0.40447 - 0.400 =  0.00447
  Δ_I-III  = 0.43295 - 0.400 =  0.03295
  Δ_I-IV   = 0.77205 - 0.800 = -0.02795
  Δ_II-III = 0.52906 - 0.700 = -0.17094
  Δ_II-IV  = 0.83828 - 0.700 =  0.13828
  Δ_III-IV = 0.81977 - 0.900 = -0.08023

  Σ|Δ_ij| = 0.45482

  Average deviation = Σ|Δ_ij| / N = 0.45482 / 6 = 0.07580



The largest absolute discrepancy between observed and calculated values is 0.17094. This dis- 
crepancy is less than three average deviations which indicates that the data is reliable; that is, the 
judges were consistent among themselves and the assignment of the values was not unjustified. 
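The consistency check can be reproduced in a few lines; this is an illustrative sketch (the variable names are ours, not the paper's), using the standard-library normal CDF in place of Normal Tables:

```python
from statistics import NormalDist
from itertools import combinations

nd = NormalDist()

# Mean scale values Z̄ from Table VII and observed percentiles p_ij from Table VI.
Z = {"S": 0.08373, "R": 0.32554, "C": 0.25262, "P": -0.66189}
p_obs = {("S", "R"): 0.400, ("S", "C"): 0.400, ("S", "P"): 0.800,
         ("R", "C"): 0.700, ("R", "P"): 0.700, ("C", "P"): 0.900}

# Work backward: theoretical p_ij = G(Z̄_i - Z̄_j); discrepancy Δ_ij = theoretical - observed.
deltas = {pair: nd.cdf(Z[pair[0]] - Z[pair[1]]) - p_obs[pair]
          for pair in combinations("SRCP", 2)}

avg_dev = sum(abs(d) for d in deltas.values()) / len(deltas)
worst = max(abs(d) for d in deltas.values())

print(round(avg_dev, 4))    # 0.0758, agreeing with Table X
print(worst < 3 * avg_dev)  # True: the judges are internally consistent
```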



SUMMARY 

The suggested technique presents the decision maker with yet another tool to be utilized in his 
quest for assistance. A decision maker with an important decision to make under conditions of un- 
certainty would surely welcome any assistance. It has been demonstrated that the method of Rank 
Order can be used to change a decision under uncertainty to a decision under risk; that is, from a 
decision with no probabilities known to one where some measure of the probabilities involved has been 
determined. Morris [3] has a very good coverage of the principles of choice which may be applied to 
such decisions. 

In cases which qualify as decisions under uncertainty, this method can be applied. If there is 
better information available it should be used, but in any case, all the information should be considered. 
The use of subjectivities such as the above is discussed extensively in Edwards [1] and in Fishburn [2]. 




There are several facets of this technique that deserve further consideration. Perhaps the distribution assumed should be one other than the Normal; perhaps a t-distribution would be more appropriate. This would have to be examined closely by the analyst. In the development of PERT a weighting
factor obtained from the Beta distribution was applied to the estimates because the judges consistently
estimated too low. Maybe a factor of this type would be applicable here.

A record should be maintained, both individually and collectively, of the judges' predictions and 
the subsequent results. Then a weighting factor could be introduced later in future judgments based 
on past data. 

Undoubtedly, there are many possible modifications to the technique presented in this section. 
Further study may prove very rewarding since the use of subjective judgments in aiding decision making 
is quite an untapped, though potentially rewarding, field. 

REFERENCES 

[1] Edwards, W., "The Theory of Decision Making," Psychological Bulletin 51, 380-417 (1954). 

[2] Fishburn, P. C, Decision and Value Theory (John Wiley and Sons, Inc., New York, 1964). 

[3] Morris, W. T., Engineering Economy (Richard D. Irwin, Inc., Homewood, Illinois, 1960). 

[4] Nightengale, M. E., "An Approach to Decisions Under Uncertainty," Arizona State University 

Industrial Engineering Research Bulletin No. 1 (1965). 
[5] Nightengale, M. E., "A Systems Approach to Strategy Formulation When Decisions Must be Made 

Under Conditions of Uncertainty," Ph. D. Dissertation, Arizona State University, 1966. 
[6] Thurstone, L. L., The Measurement of Values (University of Chicago Press, Chicago, 1959).



A QUEUEING PROCESS WITH VARYING DEGREE OF SERVICE 



Meckinley Scott 

Mathematics Department 
University of Alabama 



ABSTRACT 

This article describes and presents some results of the analysis of a queueing process 
where the degree of service given to a customer is subject to control by the management. 
The control doctrine used is based on the queue length and also on the recent history of 
the system. 

In some queueing situations the total service given to a customer may be considered as consisting
of two or more stages. In fact, in any queueing process for which we can assume that the service time
has an Erlang distribution of order k, the servicing operation may be regarded as consisting
of k stages (regardless of whether or not these stages have a physical meaning). When stage 1 of service
is complete the customer passes immediately to stage 2, and so on. There are situations in which it is
necessary to distinguish between the various stages of service, for example, (i) those which are essential and (ii) those which are not. In such a case, from the management point of view, it may be desirable to dispense with the non-essential stages of service whenever there is a large accumulation of
customers in the queue.* Thus, the degree (extent) of service given to a customer is no longer fixed,
but a variable subject to control by the management. Since these types of problems have not been discussed anywhere in the literature, the purpose of this note is to introduce and present some results of the
analysis of a simple queueing model of the above character.

DESCRIPTION OF THE MODEL 

Consider a queueing process in which the total service is completed in two stages, one to be
followed immediately by the other. Starting from an instant in which the system is empty, the queue
length increases to a certain prescribed number R for the first time. At this point, the mechanism for the
second stage of service is shut off until the queue length drops to a value r (0 ≤ r ≤ R - 1), at which
time both stages of service again become available. When the queue length increases again to R
(irrespective of whether or not, in the meantime, the system became empty) the same process is then
repeated. This policy does not, however, affect a customer whose second stage of service has already
begun. That is, unavailability of the second stage is to be understood in the sense that service of a
customer already in the second stage of service remains uninterrupted until completed, regardless
of the number of arrivals for the duration of his service time. Furthermore, it is assumed that unfortunate
customers who receive only the first stage of service do not return later for the second stage.

The above type of control on the service mechanism gives rise to the phenomenon of hysteresis. 
The availability of the total service is no longer uniquely determined by the realized queue length. 



*In this study the phrase "customers in the queue" is used in the sense that the one being served is also included.





Rather the availability of the total service depends on the previous history of the system, that is, 
on the path by which it arrived at the present state. 

The waiting room is assumed to have a capacity N (N ≥ R), and the queue discipline is first-come,
first-served.† Customers arrive into the system according to the Poisson law with intensity λ. The
lengths of service for the first and second stage are independent random variables, each following an
exponential distribution with mean durations (1/μ₁) and (1/μ₂) respectively, and are independent
of the arrival process.
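Before turning to the exact analysis, the hysteresis policy is easy to explore by simulation. The sketch below is ours, not the author's: a simple event-driven Monte Carlo of the model (Poisson arrivals at rate λ, exponential stages at rates μ₁ and μ₂, second stage switched off at queue length R and back on at r) that estimates the long-run fraction of time the second stage is available; all parameter values in the final call are arbitrary illustrations.

```python
import random

def simulate(lam, mu1, mu2, R, r, N, n_events=200_000, seed=1):
    """Monte Carlo sketch of the two-stage queue with hysteresis control:
    stage 2 is switched off when the queue length reaches R and switched
    back on when it drops to r. Returns the fraction of time stage 2 is on."""
    rng = random.Random(seed)
    n = 0             # customers in the queue (including the one in service)
    stage2_on = True  # availability indicator (Z_t in the paper's notation)
    in_stage = None   # stage (1 or 2) of the customer in service, or None
    t = on_time = 0.0
    for _ in range(n_events):
        rates = [lam if n < N else 0.0,          # arrival (blocked if room full)
                 mu1 if in_stage == 1 else 0.0,  # stage-1 completion
                 mu2 if in_stage == 2 else 0.0]  # stage-2 completion
        dt = rng.expovariate(sum(rates))
        t += dt
        if stage2_on:
            on_time += dt
        event = rng.choices([0, 1, 2], weights=rates)[0]
        if event == 0:                  # arrival
            n += 1
            if in_stage is None:
                in_stage = 1            # service always begins with stage 1
            if n >= R:
                stage2_on = False       # shut off the second-stage mechanism
        elif event == 1:                # stage-1 completion
            if stage2_on:
                in_stage = 2            # passes immediately to stage 2
            else:
                n -= 1                  # departs with first-stage service only
                in_stage = 1 if n > 0 else None
        else:                           # stage-2 completion (never interrupted)
            n -= 1
            in_stage = 1 if n > 0 else None
        if not stage2_on and n <= r:
            stage2_on = True            # both stages become available again
    return on_time / t

print(round(simulate(lam=1.0, mu1=2.0, mu2=2.0, R=5, r=1, N=10), 2))
```

Note that a customer already in stage 2 keeps the rate μ₂ even after the switch-off, mirroring the model's rule that such service remains uninterrupted.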

CHARACTERIZATION OF THE STATES IN THE QUEUE 

Associated with the service availability at time t is an indicator variable Z_t, which assumes, say,
a value 1 when the second stage of service is available and a value 0 when it is not. Let X_t denote the
queue size at time t; then from the description of the model above, it is clear that if X_t ≤ r then Z_t = 1
and if X_t ≥ R then Z_t = 0, but if r < X_t < R then Z_t may be either 1 or 0. Also, when at time t there
is at least one customer in the queue, let Y_t be another indicator variable which assumes, say, a value
1 when the customer being served is in the first stage of service and a value 2 when he is in the second
stage. Let E_0 and E_1 denote, respectively, the sets of states in which there is no customer and at least
one customer in the queue. A typical state of the set E_1 may be characterized by means of a triplet
(n, i, j), where n refers to the queue size and i and j to the values of the indicator variables Y_t and Z_t,
respectively. Of interest are four mutually exclusive and exhaustive subsets of E_1 given by, say,

A_i: {(n, i, 1); 1 ≤ n ≤ R-1}, (i = 1, 2),
B_1: {(n, 1, 0); r+1 ≤ n ≤ N},
B_2: {(n, 2, 0); R ≤ n ≤ N}.

Let A = A_1 ∪ A_2 and B = B_1 ∪ B_2. Then A describes the set of states where the second stage of service
is available and B is the set of states in which it is not. As explained earlier, note that unavailability
of the second stage of service does not affect a customer whose second stage of service has already
begun. The states of the set B_2 describe this situation. The flow diagram, Figure 1, helps illustrate
the way in which the queue operates.
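For concreteness, the four subsets can be enumerated mechanically; a minimal sketch (the function name and the tuple representation of states are illustrative, not from the paper):

```python
# Enumerate the state sets of the hysteresis queue: a state is a triplet
# (n, i, j) with n = queue size, i = stage of the customer in service,
# j = 1 if the second stage is available and 0 if it is switched off.
def state_space(r, R, N):
    A1 = [(n, 1, 1) for n in range(1, R)]          # 1 <= n <= R-1
    A2 = [(n, 2, 1) for n in range(1, R)]
    B1 = [(n, 1, 0) for n in range(r + 1, N + 1)]  # r+1 <= n <= N
    B2 = [(n, 2, 0) for n in range(R, N + 1)]      # R   <= n <= N
    return A1, A2, B1, B2

A1, A2, B1, B2 = state_space(r=0, R=6, N=10)
```

With the parameters of Table 1 (r = 0, R = 6, N = 10) this gives 5 + 5 + 10 + 5 states plus the empty state.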

STEADY STATE PROBABILITIES AND GENERATING FUNCTIONS 

Let p_t(0) denote the probability that at time t there is no customer in the queue. Also, let p_t(n, i)
and q_t(n, i), (i = 1, 2) denote the probabilities that at time t there are n customers in the queue and
that the system is in set A_i and set B_i, respectively. That is,

p_t(n, 1) = Pr [X_t = n, Y_t = 1, Z_t = 1], (n = 1, 2, ..., R-1),
p_t(n, 2) = Pr [X_t = n, Y_t = 2, Z_t = 1], (n = 1, 2, ..., R-1),
q_t(n, 1) = Pr [X_t = n, Y_t = 1, Z_t = 0], (n = r+1, r+2, ..., N),
q_t(n, 2) = Pr [X_t = n, Y_t = 2, Z_t = 0], (n = R, R+1, ..., N).

The steady-state probabilities corresponding to p_t(0), p_t(n, i), and q_t(n, i) will be denoted by p(0),



†The mathematical development of the process in this study does not specifically require the assumption of first-come,
first-served queue discipline.



QUEUEING WITH VARYING DEGREE OF SERVICE

[Figure 1 appears here: a flow diagram of the transitions among the states (n, i, j).]

Figure 1. Flow diagram showing queue operation (in the triplet (n, i, j), n refers to the queue size and i and j to the values of
the indicator variables Y_t and Z_t, respectively)



p(n, i), and q(n, i), respectively. The steady-state equations of the system can easily be derived in
the usual manner and, therefore, will not be presented here. It can be shown that the equations pertaining
to sets B_1 and B_2 yield the following solutions:

(1) q(n, 1) = [(1 - ρ_1^{n-r})/(1 - ρ_1)] q(r+1, 1), (n = r+1, r+2, ..., R-1),

(2) q(R, 1) = [(1 - ρ_1^{R-r})/(1 - ρ_1)] q(r+1, 1) - ρ_1(1 - α) p(R-1, 2),

(3) q(n, 1) = ρ_1^{n-R} [(1 - ρ_1^{R-r})/(1 - ρ_1)] q(r+1, 1)
    + [ρ_1/(α - ρ_1)][α^{n-R+2} - ρ_1^{n-R}(α - ρ_1 + αρ_1)] p(R-1, 2), (n = R+1, R+2, ..., N-1),

(4) q(N, 1) = ρ_1^{N-R} [(1 - ρ_1^{R-r})/(1 - ρ_1)] q(r+1, 1)
    + [ρ_1/(α - ρ_1)][ρ_1 α^{N-R+1} - ρ_1^{N-R}(α - ρ_1 + αρ_1)] p(R-1, 2),

(5) q(n, 2) = α^{n-R+1} p(R-1, 2), (n = R, R+1, ..., N-1),

(6) q(N, 2) = ρ_2 α^{N-R} p(R-1, 2),

where

(7) ρ_1 = λ/μ_1 ≠ 1,

(8) ρ_2 = λ/μ_2,

(9) α = λ/(λ + μ_2) ≠ ρ_1.
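Summing Eqs. (5) and (6) over the set B_2 telescopes to ρ_2 p(R-1, 2), which is the result quoted later as Eq. (20). A quick numerical sketch of that sum, using an arbitrary illustrative value for p(R-1, 2):

```python
# Check that sum of q(n,2) over n = R..N equals rho_2 * p(R-1,2).
lam, mu2 = 1.0, 3.0           # Table 1 parameters
R, N = 6, 10
rho2 = lam / mu2
alpha = lam / (lam + mu2)
p = 0.0039                    # arbitrary illustrative value for p(R-1, 2)

qn2 = [alpha ** (n - R + 1) * p for n in range(R, N)]  # Eq. (5), n = R..N-1
qN2 = rho2 * alpha ** (N - R) * p                      # Eq. (6)
PB2 = sum(qn2) + qN2
```

The identity holds because α/(1 - α) = ρ_2, so the finite geometric series plus the boundary term at N collapses exactly.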

Throughout the rest of this paper it will be assumed that ρ_1 ≠ 1 and α ≠ ρ_1. The cases where ρ_1 = 1
and α = ρ_1 can be obtained as limiting cases or by solving the equations directly. Now introduce the
following generating functions:

(10) P(u, i) = Σ_{n=1}^{R-1} u^n p(n, i), (i = 1, 2).

Explicit solutions of the equations pertaining to sets A_1 and A_2 seem very difficult to obtain, but it can
be shown that

(11) P(u, i) = Δ(u, i)/Δ(u), (i = 1, 2),

where

(12) Δ(u, 1) = (λu - λ - μ_2)[λu(1 - u)p(0) + λu^{R+1}p(R-1, 1) - μ_1 u^{r+1}q(r+1, 1)] - λμ_2 u^R p(R-1, 2),

(13) Δ(u, 2) = λ(λu - λ - μ_1)u^{R+1}p(R-1, 2) - μ_1[λu(1 - u)p(0) + λu^{R+1}p(R-1, 1) - μ_1 u^{r+1}q(r+1, 1)],

and

(14) Δ(u) = λ²(u - 1)(u - u_1)(u - u_2),

where

(15) u_i = [(λ + μ_1 + μ_2) ∓ √((λ + μ_1 + μ_2)² - 4μ_1μ_2)]/(2λ), (i = 1, 2),

with u_1 having the negative sign in front of the radical.
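The roots in Eq. (15) are easy to evaluate; by Vieta's formulas they must satisfy u_1 u_2 = μ_1μ_2/λ² and u_1 + u_2 = (λ + μ_1 + μ_2)/λ, which makes a convenient numerical check (parameter values illustrative):

```python
import math

# Roots u_1, u_2 of Eq. (15) for the denominator Delta(u).
lam, mu1, mu2 = 1.0, 2.0, 3.0
s = lam + mu1 + mu2
rad = math.sqrt(s * s - 4.0 * mu1 * mu2)
u1 = (s - rad) / (2.0 * lam)   # negative sign in front of the radical
u2 = (s + rad) / (2.0 * lam)
```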

The probabilities of the sets A_1 and A_2 can be obtained by considering Eqs. (11) through (15), and
are given by

(16) P(A_1) = [1/(1 - ρ)][ρ_1 p(0) + ρ_1(1 - ρ_2) p(R-1, 2) - (R - r) q(r+1, 1)]

and

(17) P(A_2) = [1/(1 - ρ)][ρ_2 p(0) + ρ_1ρ_2 p(R-1, 2) - (R - r)(ρ_2/ρ_1) q(r+1, 1)],

respectively, where

(18) ρ = ρ_1 + ρ_2 ≠ 1.

The probabilities of the sets B_1 and B_2 can be obtained from Eqs. (1) through (9) and are, respectively,
given by

(19) P(B_1) = [(R - r)(1 - ρ_1) - ρ_1^{N-R+1}(1 - ρ_1^{R-r})] q(r+1, 1)/(1 - ρ_1)²
    + ρ_1[ (ρ_1 - ρ_2)α^{N-R+1}/(α - ρ_1) + (α - ρ_1 + αρ_1)ρ_1^{N-R+1}/((1 - ρ_1)(α - ρ_1)) - (1 - ρ_2)/(1 - ρ_1) ] p(R-1, 2)


and

(20) P(B_2) = ρ_2 p(R-1, 2).

It can also be shown that the expected queue length L, say, is given by

(21) L = L(A) + L(B),

where

(22) L(A) = [1/(1 - ρ)²][ (ρ - ρ_1ρ_2) p(0) - ((R - r)/ρ_1){ρ_1 + ρ_1ρ_2 + ρ_2 + ½ρ(1 - ρ)(R + r + 1)} q(r+1, 1)
    + ρ_1{ρ - ρ_1ρ_2 + R(1 - ρ)} p(R-1, 2) ]

and

(23) L(B) = [ ½(R - r)(R + r + 1) + (ρ_1/(1 - ρ_1)){(R - r) - (1 - ρ_1^{R-r})(N + r + 1)ρ_1^{N-R}} ] q(r+1, 1)/(1 - ρ_1)
    + [ Rρ_2(α - ρ_1 + ρ_1α²) + ρ_1(α - ρ_1 + αρ_1)(1 - α^{N-R}) + Nρ_1(ρ_1 - ρ_2)α^{N-R+1} ] p(R-1, 2)/((1 - ρ_1)(α - ρ_1)).

In fact, L(A) is the contribution to the expected queue length L from the set A = A_1 ∪ A_2, and similar
remarks apply to L(B).

So far, all quantities have been expressed in terms of p(0), p(R-1, 1), p(R-1, 2) and q(r+1, 1);
these can be determined by solving the following set of linear equations:



(24) λp(R-1, 1) + λp(R-1, 2) - μ_1 q(r+1, 1) = 0,

(25) (u_i - 1)p(0) + (u_i^r - u_i^R)p(R-1, 1) + u_i^r[1 + μ_2 u_i^{R-r-1}/(λu_i - λ - μ_2)]p(R-1, 2) = 0, (i = 1, 2),

(26) [1/(1 - ρ)]p(0)
    + [ ρ_2(1 - ρ_2)/((1 - ρ_1)(1 - ρ)) + ρ_1(ρ_1 - ρ_2)α^{N-R+1}/(α - ρ_1) + (α - ρ_1 + αρ_1)ρ_1^{N-R+2}/((1 - ρ_1)(α - ρ_1)) ] p(R-1, 2)
    - [1/(1 - ρ_1)][ (R - r)ρ_2/(ρ_1(1 - ρ)) + (1 - ρ_1^{R-r})ρ_1^{N-R+1}/(1 - ρ_1) ] q(r+1, 1) = 1.

Equations (24) and (25) above are derived from the condition that in P(u, 1) the zeros of the numerator
coincide with the zeros of the denominator,* and Eq. (26) is obtained from the normalizing condition
for the sum of all probabilities.

*An equivalent set of equations is obtained by considering P(u, 2). Equation (24) may also be obtained from the equations of the set B.






Tables 1 through 4 give some numerical computations of p(0), P(A_i), P(B_i), (i = 1, 2) and L.

Table 1. N = 10, R = 6, λ = 1, μ_1 = 2, μ_2 = 3

  r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)      L
  0    0.2469   0.3806   0.2524   0.1188   0.0013    2.1038
  1    0.2361   0.3968   0.2631   0.1026   0.0014    2.1510
  2    0.2265   0.4113   0.2727   0.0881   0.0015    2.2282
  3    0.2178   0.4244   0.2812   0.0749   0.0017    2.3339
  4    0.2095   0.4368   0.2891   0.0624   0.0021    2.4736
  5    0.2012   0.4494   0.2967   0.0497   0.0029    2.6667



Table 2. N = 10, R = 7, λ = 1, μ_1 = 2, μ_2 = 3

  r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)      L
  0    0.2321   0.4031   0.2678   0.0961   0.0009    2.2551
  1    0.2243   0.4149   0.2756   0.0843   0.0010    2.2920
  2    0.2172   0.4256   0.2827   0.0735   0.0010    2.3516
  3    0.2107   0.4354   0.2891   0.0637   0.0011    2.4315
  4    0.2047   0.4445   0.2950   0.0545   0.0013    2.5320
  5    0.1990   0.4533   0.3006   0.0456   0.0016    2.6585
  6    0.1930   0.4624   0.3060   0.0364   0.0022    2.8285



Table 3. N = 12, R = 6, λ = 1, μ_1 = 0.95, μ_2 = 10

  r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)      L
  0    0.0450   0.1308   0.0124   0.8118   0.0000    6.8329
  1    0.0425   0.1619   0.0154   0.7802   0.0000    6.8613
  2    0.0403   0.1923   0.0182   0.7492   0.0000    6.9021
  3    0.0384   0.2221   0.0211   0.7184   0.0000    6.9533
  4    0.0367   0.2517   0.0239   0.6877   0.0000    7.0153
  5    0.0353   0.2810   0.0266   0.6570   0.0001    7.0921



Table 4. N = 12, R = 7, λ = 1, μ_1 = 0.95, μ_2 = 10

  r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)      L
  0    0.0436   0.1552   0.0147   0.7865   0.0000    6.8743
  1    0.0411   0.1870   0.0178   0.7541   0.0000    6.9074
  2    0.0389   0.2182   0.0207   0.7221   0.0000    6.9521
  3    0.0370   0.2491   0.0236   0.6902   0.0000    7.0069
  4    0.0354   0.2799   0.0266   0.6582   0.0000    7.0701
  5    0.0339   0.3107   0.0295   0.6258   0.0000    7.1421
  6    0.0327   0.3418   0.0324   0.5930   0.0001    7.2271



AVAILABILITY OF SECOND STAGE OF SERVICE 

For the queueing process under consideration it is clear that the most interesting characteristics
are those concerned with the availability or unavailability of the second stage of service. The probability
of the set B, given by

(27) P(B) = P(B_1) + P(B_2),

may be interpreted as the fraction of time the second stage of service is unavailable, whereas

(28) 1 - P(B) = p(0) + P(A)
             = p(0) + P(A_1) + P(A_2)

is the fraction of time it is available. In some instances, it may also be important to know how often the
mechanism for the second stage of service is switched off or switched on. Let I denote the expected
frequency per unit time with which the mechanism is switched off. Under steady-state conditions, I also equals
the expected frequency per unit time with which the mechanism is switched on. In fact, Eq. (24) reflects this property
of the system. We have then

(29) I = μ_1 q(r+1, 1)
       = λ{p(R-1, 1) + p(R-1, 2)}.

A FIRST-PASSAGE TIME PROBLEM 

This section deals with the distribution of the time that elapses before the system enters the set A,
starting initially from one of the states in the set B. Let f_{n,i}(t), (i = 1, 2) denote the probability density
function (p.d.f.) of the required distribution, given that initially the system was in a state (n, i, 0) of the
set B. A little reflection shows that f_{R,2}(t) is identical with the p.d.f. of the time that elapses from the
instant the system enters the set B from the set A until it reenters the set A (that is, from the instant
the service mechanism for the second stage is switched off until it is switched on again), given that
entry into the set B is made through the state (R, 2, 0). Similar remarks apply to the p.d.f. f_{R,1}(t). A
little further reflection shows that f_{n,1}(t), (n ≥ r+1) is the same as the distribution of the time, in an
M|M|1 queue with capacity N, that elapses before the queue size drops to a value r starting initially
from a value n. If we denote by f*_{n,i}(s) the Laplace transform of the p.d.f. f_{n,i}(t), then using the
same kind of argument as in [4], it can be shown that

(30) f*_{n,1}(s) = k_1 α_1^n + k_2 α_2^n, (n = r+1, r+2, ..., N),

(31) f*_{n,2}(s) = b_1[ k_1(α_1^{n-1} - b_2^{N-n} α_1^{N-1})/(1 - b_2 α_1) + k_2(α_2^{n-1} - b_2^{N-n} α_2^{N-1})/(1 - b_2 α_2) ]
    + b_3 b_2^{N-n}{k_1 α_1^{N-1} + k_2 α_2^{N-1}}, (n = R, R+1, ..., N),
where 

(32) α_1 = [(λ + μ_1 + s) + √((λ + μ_1 + s)² - 4λμ_1)]/(2λ),

(33) α_2 = [(λ + μ_1 + s) - √((λ + μ_1 + s)² - 4λμ_1)]/(2λ),

(34) b_1 = μ_2/(λ + μ_2 + s),

(35) b_2 = λ/(λ + μ_2 + s),

(36) b_3 = μ_2/(μ_2 + s),

(37) k_1 = c_1 α_2^{N-r-2}(c_3 - α_2) / (α_1^{r+1}[α_1^{N-r-2}(1 - c_2α_2)(α_1 - c_3) + α_2^{N-r-2}(1 - c_2α_1)(c_3 - α_2)]),

(38) k_2 = c_1 α_1^{N-r-2}(α_1 - c_3) / (α_2^{r+1}[α_1^{N-r-2}(1 - c_2α_2)(α_1 - c_3) + α_2^{N-r-2}(1 - c_2α_1)(c_3 - α_2)]),

(39) c_1 = μ_1/(λ + μ_1 + s),

(40) c_2 = λ/(λ + μ_1 + s),

(41) c_3 = μ_1/(μ_1 + s).

The expected value E{T_{n,i}}, say, corresponding to the p.d.f. f_{n,i}(t) is given by

(42) E{T_{n,1}} = (n - r)/(μ_1 - λ) - [λ/(μ_1 - λ)²]{(λ/μ_1)^{N-n} - (λ/μ_1)^{N-r}}, (n = r, r+1, ..., N), and

(43) E{T_{n,2}} = E{T_{n-1,1}} + [1/(μ_1 - λ)][(μ_1/μ_2) - (λ/(λ + μ_2 - μ_1))(λ/μ_1)^{N-n+1}]
    + [(λ + μ_2)/(μ_1 - λ)][λ/(μ_1(λ + μ_2 - μ_1)) - 1/μ_2](λ/(λ + μ_2))^{N-n+1}, (n = R, R+1, ..., N).

Notice that when λ and μ_1 are fixed and μ_2 → ∞, then E{T_{n,2}} → E{T_{n-1,1}}, a result which is intui-
tively obvious. Furthermore, when μ_1 = μ_2 then E{T_{n,2}} = E{T_{n,1}}, (n = R, R+1, ..., N).
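The first-passage moment in Eq. (42) is easy to confirm against the elementary one-step recursion for an M|M|1 queue with capacity N: the expected time t_m to go from m to m-1 satisfies μ_1 t_m = 1 + λ t_{m+1} with t_N = 1/μ_1, and E{T_{n,1}} is the sum of t_m over m = r+1, ..., n. A sketch with illustrative parameter values:

```python
# Expected first-passage time from queue size n down to r in M/M/1 with
# capacity N (arrivals blocked at N): recursion vs. the closed form (42).
def passage_time_recursive(lam, mu1, n, r, N):
    t = {N: 1.0 / mu1}
    for m in range(N - 1, r, -1):
        t[m] = (1.0 + lam * t[m + 1]) / mu1
    return sum(t[m] for m in range(r + 1, n + 1))

def passage_time_closed(lam, mu1, n, r, N):
    rho = lam / mu1
    return (n - r) / (mu1 - lam) - lam / (mu1 - lam) ** 2 * (
        rho ** (N - n) - rho ** (N - r))
```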

DISCUSSION 

In this paper, we have presented expressions for several important characteristics of a queueing 
process where the degree of service given to a customer is subject to control by the decision maker. 
A knowledge of these characteristics enables us, in principle, to construct a reasonable objective 
function based on costs originating from three sources: (a) a cost proportional to the queueing times of 
customers, (b) costs associated with the two stages of service, and (c) costs associated with switching 
the mechanism for the second stage of service. In any given situation where the parameters of the serv- 
ice times and arrival time distributions are fixed and the objective function defined, the value of the 
objective function may be computed for any given set of the control variables r, R and N. Needless to 
say, the problem of deriving optimal sets of the control variables is one of considerable complexity 
and we will not pursue this subject any further. 

In a certain sense, the model discussed in this paper may also be considered as a queueing process
involving a variable number of servers when it is assumed that the two stages of service are operated
by two different servers. Queueing processes involving a variable number of servers have received
some attention in the literature; see, for example, [2]. However, in those models every customer is given the
same degree of service and two or more customers may be served simultaneously, whereas this is not
so in the present case.

REFERENCES 

[1] Cox, D. R. and W. L. Smith, Queues (Methuen and Wiley, 1961).
[2] Moder, J. J. and C. R. Phillips, Jr., "Queueing with Fixed and Variable Channels," Operations Research, Vol. 10, No. 2 (Mar.-Apr. 1962).
[3] Morse, P. M., Queues, Inventories and Maintenance (John Wiley and Sons, New York, 1958).
[4] Naor, P., "On a First-Passage Time Problem in Chemical Engineering," University of North Carolina, Institute of Statistics, Mimeo Series No. 400 (Aug. 1964).
[5] Saaty, T. L., Elements of Queueing Theory (McGraw-Hill, New York, 1961).
[6] Yadin, M. and P. Naor, "On Queueing Systems with Variable Service Capacities," Nav. Res. Log. Quart. 14, 43-53 (1967).



MAD: MATHEMATICAL ANALYSIS OF DOWNTIME 



Edward B. Brandt 

The RAND Corporation 
Washington, D.C. 

and 

Dilip R. Limaye 

Decision Sciences Corporation 
Jenkintown, Pa. 



ABSTRACT 

The MAD model presents a mathematical treatment of the relationship between aircraft
reliability and maintainability, system manning and inspection policies, scheduling
and sortie length, and aircraft downtime. Log normal distributions are postulated for subsystem
repair times, and simultaneous repair of malfunctions is assumed. The aircraft
downtime for maintenance is computed using the distribution of the largest of k log
normally distributed repair times. Waiting time for maintenance men is calculated either by using a multiple-
channel queuing model or by generating the distribution of the number of maintenance
men required and comparing this to the number of men available to determine the proba-
bility of waiting at each inspection.

INTRODUCTION 

The maintenance characteristics of alternative aircraft designs should be properly represented 
in cost-effectiveness trade studies through their impact on both cost and effectiveness. The cost impact of 
greater maintenance requirements is recognized through higher manning tables, increased costs of 
ground support equipment, etc. However, maintenance also influences measures of effectiveness. 
The traditional approach in aircraft reliability analyses relates malfunctions to flight hours. Because 
downtime is influenced partly by the number of malfunctions and the number of malfunctions depends 
on utilization, availability decreases with higher utilization. Therefore, downtime determines either 
the utilization consistent with a specified availability or the availability consistent with a specified 
utilization. Both of these measures are important parameters in analyses of effectiveness. 

The MAD model relates the downtime of an aircraft to the component or subsystem repair times, 
maintenance occurrences, inspection philosophy, flight scheduling, sortie length, and the number 
of men assigned to the maintenance facility. 

The model can be used to:

• Determine the downtime implications of changes in reliability (R) and maintainability (M).

• Evaluate the impact of additional maintenance men on aircraft waiting time and determine
the optimum number of maintenance men.



*Research on this paper was performed by the authors at the Vertol Division, The Boeing Company, Philadelphia, Pa. 







• Determine the effect of alternative inspection philosophies on downtime. In particular, the 
model is sensitive to the flight hours between inspections and the amount of deferred maintenance 
for each type of inspection. 

• Evaluate the surge capability of the system; i.e., the ability to fly more than usual for a given 
period of time. 

GENERAL DESCRIPTION OF MODEL 

The model comprises several subroutines. The overall flow diagram of the model is shown in Figure 1. 



[Figure 1 shows the model's flow: maintenance action rates, elapsed maintenance times per action, flight hours between inspections, and the probability of detecting a failure at inspection feed the probability PR(K,I) of K maintenance actions at the Ith inspection (I = 1, ..., N). This drives the expected maintenance downtime due to the longest of K simultaneous maintenance actions, the downtime in the look phase of each type of inspection, the probability of waiting due to an insufficient number of maintenance men, NORS (not operationally ready, spares) downtime, and administrative delays; these combine into downtime per flight hour D and the utilization/availability relation A = f(U, D, m).]

Figure 1. MAD model flow chart



SPECIFIC SUBROUTINES 

Probability Distribution of Maintenance Actions 

The model is formulated under the assumption that most failures are repaired at inspections. 
In other words, the aircraft is not considered to abort its mission and immediately return to base 
for every failure. Failures causing a mission abort are treated separately and assumed to be repaired 
individually. All other maintenance actions are located and repaired at scheduled intervals such as 
preflight, postflight, daily, and periodic inspections; multiple actions can be repaired simultaneously 
when the required maintenance men are available. 

The Poisson distribution describes the number of failures occurring at each inspection. The funda-
mental assumption underlying the use of a Poisson distribution is that individual failures are inde-
pendent. Given independence, it can be proved that, for a complex system of many components, the
probability distribution of the time between actions is exponential, regardless of the individual failure
law for each component [1]. The distribution of the number of actions in a specified time period is
Poisson when the time between actions is exponential.

Therefore, the probability of x failures occurring at the ith inspection is given by

P_i(x) = (λt_i)^x e^{-λt_i} / x!,

where λ = failure rate (occurrences per flight hour),

and t_i = flight hours between inspections (i - 1) and i.

It is not assumed that all inspections are so thorough that every failure will be detected and then 
repaired. For example, a certain inspection may not check for transmission leaks or hydraulic pressure. 
Additionally, some failures are not flight-critical and are deferred to later inspections. All failures not 
repaired at the inspection directly after their occurrence are considered undetected or deferred. These 
failures are noted and examined at succeeding inspections until there is a major inspection in the 
inspection cycle, at which time all previously undetected actions are detected and repaired. 

The probability of repair at the ith inspection (P_i) can be derived either empirically or on the basis
of the inspected components and their relative occurrence. The probability distribution function of
repairing k failures at the ith inspection, given that x have occurred, is binomial:

(1) P_i(k|x) = (x choose k) P_i^k (1 - P_i)^{x-k}.

Because the distribution of failures occurring is Poisson, the marginal distribution of the number
of actions repaired at the ith inspection is

(2) P_i(k) = Σ_{x=k}^{∞} P_i(k|x) P_i(x) = (P_i λt_i)^k e^{-P_i λt_i} / k!.

The distribution of the number of maintenance actions repaired at the ith inspection is therefore
Poisson with mean P_i λt_i.

The probability distribution of the number of failures that are not repaired is Poisson with mean
(1 - P_i)λt_i. In the notation defined above, the unrepaired failures after the first inspection (U_1) are

(3) U_1 = ((1 - P_1)/P_1) m̄_1,

where m̄_1 = expected number of actions repaired at inspection 1 = P_1 λt_1.



528 E. B. BRANDT AND D. R. LIMAYE 

The available actions at the second inspection include the unrepaired failures from the first
inspection plus any failures occurring between the first and second. The probability of repair at the
second inspection is applied to all available actions. Therefore,

(4a) m̄_2 = P_2[((1 - P_1)/P_1) m̄_1 + λt_2] and U_2 = ((1 - P_2)/P_2) m̄_2.

In general,

(4b) m̄_n = P_n[((1 - P_{n-1})/P_{n-1}) m̄_{n-1} + λt_n] and U_n = ((1 - P_n)/P_n) m̄_n.

Consider a constant injection of new actions occurring at a rate λ per inspection and a constant prob-
ability of repair P at each inspection. The expected number of unrepaired failures at the nth inspection
U_n is given by:

(4c) U_n = Σ_{i=1}^{n} (1 - P)^i λ.

As n becomes large,

(4d) lim U_n = ((1 - P)/P) λ, 0 < P ≤ 1,

and the expected number of repaired actions is

(4e) lim m̄_n = P[((1 - P)/P) λ + λ] = λ.

In cases where there is a major inspection (n+1) in the inspection cycle at which all previously unde-
tected actions are detected and repaired, we have P_{n+1} = 1 and therefore,

(4f) m̄_{n+1} = ((1 - P_n)/P_n) m̄_n + λt_{n+1}, and U_{n+1} = ((1 - P_{n+1})/P_{n+1}) m̄_{n+1} = 0.

The succeeding (n+1) inspections will then follow the same pattern as inspections 1 to (n+1), and
thus a steady-state cyclical pattern evolves.

Equations (4a) through (4f) imply an assumption that an undetected failure is as likely to be detected
at a succeeding inspection as a new failure. This assumption is not critical to the computation of down-
time. When the total number of inspections in the inspection cycle is small and the probabilities P_i are
large, the total aircraft downtime calculation is not significantly affected if an alternative assumption
is made that a failure not detected at an inspection remains undetected until the major inspection.
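The limiting behavior in (4d)-(4e) is easy to confirm by iterating the recursion with constant repair probability P and constant expected new actions per inspection (written lam_t below; names illustrative):

```python
# Iterate m_n = P*(U_{n-1} + lam_t), U_n = ((1-P)/P)*m_n; the backlog U_n
# should approach ((1-P)/P)*lam_t and the repaired count m_n should
# approach lam_t, per Eqs. (4d) and (4e).
def backlog(P, lam_t, n_inspections):
    m, U = 0.0, 0.0
    for _ in range(n_inspections):
        m = P * (U + lam_t)        # actions repaired this inspection
        U = (1 - P) / P * m        # actions left undetected or deferred
    return m, U

m_n, U_n = backlog(P=0.8, lam_t=2.0, n_inspections=50)
```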



In the latter case, the equations for the undetected failures and repaired actions will be given by:

(4g) m̄_2 = P_2 λt_2,

(4h) U_2 = ((1 - P_2)/P_2) m̄_2 + ((1 - P_1)/P_1) m̄_1,

⋮

(4k) m̄_n = P_n λt_n,

(4p) U_n = Σ_{i=1}^{n} ((1 - P_i)/P_i) m̄_i,

(4q) m̄_{n+1} = λt_{n+1} + Σ_{i=1}^{n} ((1 - P_i)/P_i) m̄_i,

(4r) U_{n+1} = 0.

Distribution of the Longest Maintenance Action

Distribution of the Longest Maintenance Action 

The aircraft is out of commission for only the longest individual repair time when maintenance
items are repaired simultaneously. It is assumed that the only limitation to simultaneous repair results
from insufficient manpower; this is accounted for in maintenance waiting times. Downtime from active
maintenance at each inspection (D_i) is given by

(5) D_i = Σ_{k=1}^{∞} E_L(k) P_i(k),

where E_L(k) = expected repair time of the longest of k simultaneous maintenance actions.

The probability distribution of maintenance actions P_i(k) is given by Eq. (2). The expected value of
the longest repair time is

(6) E_L(k) = ∫_0^∞ y f_{Y*}(y) dy = ∫_0^∞ y k[F(y)]^{k-1} f(y) dy,

where F(y) = the cumulative distribution function of the time to repair a single action.

A numerical solution can be easily obtained if one assumes that each subsystem or component will
always be repaired in the same time. However, this assumption is unrealistic, especially when higher
levels of aggregation are necessary. The chosen method of solution uses a continuous probability density
function to describe repair times. By this method, the variability within as well as between subsystems
can be properly represented.

The log normal distribution was selected to represent the times to repair for the following reasons: 

1. The log normal is a flexible, two-parameter distribution which is skewed to the right. The 
probability of zero repair time is always zero. 

2. The log normal distribution has been shown to fit well to reaction times of humans to more 
complicated perceptual patterns involving some degree of learning [2]. 

3. The actual distributions of repair times for individual systems and the entire aircraft compare 
favorably to the log normal for both helicopters and fixed-wing aircraft. 



The log normal distribution is given by

(7) f_Y(y) = [1/(yσ√(2π))] exp{-(1/(2σ²)) ln²(y/y_m)},

where σ = standard deviation of ln y, and
y_m = median of y.

If y_1, y_2, ..., y_n are samples from a log normal distribution, the distribution of the largest,
Y* {= MAX(y_1, y_2, ..., y_n)}, is given by

(8) f_{Y*}(y) = n[F_Y(y)]^{n-1} f_Y(y),

where

(9) F_Y(y') = ∫_0^{y'} [1/(yσ√(2π))] exp{-(1/(2σ²)) ln²(y/y_m)} dy.

Substituting this result in (6), we get

(10) E_L(k) = ∫_0^∞ y f_{Y*}(y) dy = ∫_0^∞ y k[F_Y(y)]^{k-1} f_Y(y) dy,

where f_Y(y) and F_Y(y) are given by (7) and (9), respectively.

The integral in (10) is not easy to evaluate; however, Gumbel [3] has derived approximate results for
different values of k. These results were used in the model.
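Equation (10) can also be evaluated by direct quadrature rather than Gumbel's approximations; a sketch with illustrative parameters (for k = 1 it must return the log normal mean y_m e^{σ²/2}):

```python
import numpy as np

def longest_repair_mean(k, y_m, sigma, n_grid=400000, y_max=60.0):
    # Log normal density, Eq. (7), on a fine grid; F via a running sum,
    # Eq. (9); and E_L(k) as the direct integral of Eq. (10).
    dy = y_max / n_grid
    y = np.arange(1, n_grid + 1) * dy
    f = np.exp(-np.log(y / y_m) ** 2 / (2.0 * sigma ** 2)) / (
        y * sigma * np.sqrt(2.0 * np.pi))
    F = np.cumsum(f) * dy
    return float(np.sum(y * k * F ** (k - 1) * f) * dy)
```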

Maintenance Waiting Times 

This category accounts for the additional downtime resulting from the inability to achieve com- 
plete maintenance concurrency because of insufficient manpower. It is obvious that waiting times are 
influenced significantly by the distribution of failure occurrences. Poor planning of the assignment of 
aircraft or erratic demands for their use can lead to substantial waiting in one period and idle time 
in another. 

Two approaches have been programmed with the hope of obtaining a most probable value and a 
lower limit for maintenance waiting time. The approach which yields the most probable value assumes 
that aircraft will arrive simultaneously at the maintenance facility with the requirement that the 
combined actions be processed at once. The second approach assumes perfect aircraft scheduling in 
which the expected number of actions arriving at the maintenance facility is constant per unit time 
period. 

Bulk Arrival of Maintenance Actions. Each aircraft is assumed to go through the identical series
of inspections throughout the entire inspection cycle. The inspection cycle may last for a number of
periods, after which the aircraft is processed through a major inspection and is considered returned
to a new condition. When many aircraft are assigned to the same facility, the combined maintenance
actions must be processed together to avoid delays. Each aircraft is assigned a position in the inspection
cycle. The distribution for the combined actions is Poisson with mean equal to

(11) m̄_T = Σ_{j=1}^{N_A} m̄_{ij},



where m̄_{ij} = expected number of actions for aircraft j at inspection i (see Eqs. (4a) and (4b)), and
N_A = total number of aircraft.

It is now necessary to find the distribution of the number of maintenance men required, given that k
actions are to be repaired simultaneously. Consider the aircraft as a collection of tasks, each requiring
a particular number of men. It is possible to construct a distribution of the number of men per task.
By employing the central limit theorem, the distribution of the average number of men required is
normal, given that the number of actions is large. The central limit theorem is applicable here because
the region in which manpower shortage exists occurs only at high values of k. Given k actions, the
distribution of the total number of men required can be assumed normal with mean μ_{m,k} and standard
deviation σ_{m,k}, where

μ_{m,k} = kE(A), and

σ_{m,k} = σ_A √k,

where E(A) = expected value of the number of men per task, and
σ_A = standard deviation of the number of men per task.

The aircraft must wait if the number of men required (M) exceeds the number available (M_0). There-
fore, the probability of waiting, given that k actions have occurred, is

(12) P_w(k) = P[M > M_0 | k] = ∫_{M_0}^{∞} f_M(m) dm
    = ∫_{M_0}^{∞} [1/(σ_{m,k}√(2π))] exp{-½((m - μ_{m,k})/σ_{m,k})²} dm.

The total probability of waiting is

(13) P_w = Σ_k P_w(k) P(k).

This probability can be evaluated at each inspection. To determine the expected waiting time, it is
assumed that the repair time for a task is independent of the manloading. Then the waiting time when
the number of men required is m is given by

(14) W(m) = ((m - M_0)/M_0) R̄_t,

where R̄_t = average elapsed repair time.

The expected waiting time is then given by

(15) E[W] = Σ_k P(k) ∫_{M_0}^{∞} ((m - M_0)/M_0) R̄_t f_M(m) dm.
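Equations (12)-(13) amount to a Poisson mixture of normal tail probabilities; a sketch with illustrative values (M_0, E(A), and σ_A are inputs, not values from the paper):

```python
import math

# Total probability that k simultaneous actions need more than M0 men:
# Poisson weights (mean m_T) times a normal tail with mean k*E_A and
# standard deviation sigma_A*sqrt(k), per Eqs. (12)-(13).
def prob_waiting(m_T, E_A, sigma_A, M0, k_max=100):
    total = 0.0
    for k in range(1, k_max):
        pk = m_T ** k * math.exp(-m_T) / math.factorial(k)
        z = (M0 - k * E_A) / (sigma_A * math.sqrt(k))
        total += pk * 0.5 * math.erfc(z / math.sqrt(2))  # P[M > M0 | k]
    return total
```

Adding men (larger M_0) should lower the probability of waiting, which gives a simple monotonicity check.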




Continuous Arrivals of Maintenance Actions. This approach uses the standard multiple-channel 
waiting-time model. The required input information is the arrival rate, service time, and the number of 
channels. For our problem, the maintenance actions represent the arrivals; the repair time is the service 
time; and the number of maintenance teams is the channels. 

The arrival rate is a function of the occurrences per flight hour, flight hours per unit time period, 
number of aircraft assigned to maintenance facility, and the total maintenance working hours (clock 
time) per unit time period. The number of channels is equal to the total men assigned to the maintenance 
facility divided by the average men per action. The service time distribution is given by the distribution 
of elapsed repair times. 

The main advantages of the queuing theory approach are its simplicity and the availability of 
computational procedures to determine solutions. The main disadvantage is the assumptions that are 
forced upon the problem in order to facilitate the mathematical solution. 

These assumptions are as follows: 

1. Arrival rate — The expected number of actions arriving at the maintenance facility is constant 
per unit time interval. This assumes nearly perfect scheduling which ensures a constant demand rate 
for maintenance men. 

The probability distribution of the time between arrivals is exponential. 

2. Service time — The expected service time for an action is equal to the average time to repair 
(ATTR). 

The probability distribution of repair times is assumed to be exponential. 

3. Numbers of channels — The number of men per task is constant. 

Under these assumptions, the standard multiple-channel queuing model is applicable and the 
expected value of waiting time can be easily determined. 

Let λ = arrival rate,

μ = mean service rate per channel,

s = number of channels;

then the expected waiting time is given by Ref. [4]:

E(ω) = P_0 (λ/μ)^s / (s! sμ(1 - ρ)²),

where

ρ = density = λ/(μs), and

P_0 = probability that zero teams are working

    = [ Σ_{n=0}^{s-1} (λ/μ)^n/n! + (λ/μ)^s/(s!(1 - ρ)) ]^{-1}.
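The standard multiple-channel (M|M|s) waiting-time formula cited from Ref. [4] can be written compactly; for s = 1 it must reduce to the M|M|1 result W_q = λ/(μ(μ - λ)):

```python
import math

# Expected waiting time in queue for an M/M/s system; lam = arrival rate,
# mu = service rate per channel, s = channels.  Requires rho = lam/(mu*s) < 1.
def mms_wait(lam, mu, s):
    rho = lam / (mu * s)
    a = lam / mu
    P0 = 1.0 / (sum(a ** n / math.factorial(n) for n in range(s))
                + a ** s / (math.factorial(s) * (1 - rho)))
    Lq = P0 * a ** s * rho / (math.factorial(s) * (1 - rho) ** 2)
    return Lq / lam            # Little's law: W_q = L_q / lambda
```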






Comparison of the Two Methods. The bulk-arrival method is the more realistic representation of 
actual conditions. Furthermore, it is compatible with the assumptions made in other sections of the 
model because it considers the sortie length and the distribution of men per task. The major disad- 
vantage is that the number of working hours between aircraft arrivals is required as input. 

The queuing-theory approach depicts a highly optimistic system wherein maintenance actions are 
arriving continuously at a constant rate. These assumptions contradict the maintenance downtime 
routine which assumes the simultaneous repair of tasks when the aircraft undergoes an inspection. 
Furthermore, the service time distribution is considered exponentially distributed with a constant 
manloading while the maintenance downtime routine assumes log normal service times with variable 
manloading. The queuing-theory approach can be refined by assuming other distributions of service 
times, but then the computational procedures become very complex.

RELATION BETWEEN AVAILABILITY, UTILIZATION, AND DOWNTIME 

The relationship between availability and utilization has been traditionally expressed as shown 
in Figure 2. This approach assumes downtime per flight hour to be fixed. 



[Figure 2. Traditional A/U plots: average utilization per day plotted against availability, with one line for each fixed downtime value d1 and d2, where d2 > d1.]

The results of the MAD model show a somewhat different picture. Availability no longer responds 
linearly to average daily utilization. This occurs because some inspections are on a calendar basis, or 
before and after each flight. When utilization increases, downtime per flight hour increases and forces 
the curve to bend back because of the additional waiting time accompanying the longer flight program 
(see Figure 3). The curves designated T1, T2, T3, and T4 represent larger manning levels (T4 > T3 > T2 > T1).




[Figure 3. MAD A/U plots: average utilization per day plotted against availability, one curve for each manning level T1 through T4.]



SUMMARY 

The MAD model incorporates a new approach for the determination of the interrelationship 
between aircraft malfunction rates and repair times, inspection philosophy, and number of men as- 
signed to the maintenance facility; and between availability, utilization, and downtime. The model is 
therefore suitable for evaluating the cost-effectiveness implications of changes in subsystem failure 
rates and maintenance times. It can also evaluate different inspection philosophies (such as flight-hour 
inspections versus calendar-day inspections) and determine the optimum number of men required 



534 E. B. BRANDT AND D. R. LIMAYE 

to achieve a specified level of availability and utilization. Such analyses have traditionally been per- 
formed with detailed simulations of the maintenance environment incorporating Monte Carlo tech- 
niques. In the MAD model, attempts have been made to represent the significant characteristics of the 
real-world situation as accurately as possible using probability distributions. The model is thus simple 
and easy to use, and requires significantly lower running time than Monte Carlo simulations. The 
level of detail in this model makes it suitable for tradeoffs and sensitivity analyses in the concept- 
formulation and early contract-definition phases. 

The MAD model has been programmed and is operational on an IBM 360/65. It has been validated 
by testing it against actual maintenance data from Army CH-47 helicopters at Fort Rucker, Alabama. 
The model is now being used for subsystem tradeoffs and sensitivity analyses for the HLH (Heavy-Lift 
Helicopter) and LIT (Light Intratheater Transport) Programs. 

REFERENCES 

[1] Drenick, R. F., "The Failure Law of Complex Equipment," J. Soc. Indus. and Appl. Math., Dec. 1960, pp. 680-690.
[2] Goldman, A. S., and T. B. Slattery, Maintainability: A Major Element of System Effectiveness (John Wiley and Sons, New York, 1967).
[3] Gumbel, E. J., Statistics of Extremes (Columbia University Press, New York, 1960).
[4] Hillier, F. S., and G. J. Lieberman, Introduction to Operations Research (Holden-Day, 1967).



A METHODOLOGY FOR ESTIMATING EXPECTED USAGE OF REPAIR 
PARTS WITH APPLICATION TO PARTS WITH NO USAGE HISTORY 

Sheldon E. Haber and Rosedith Sitgreaves*

The George Washington University 

School of Engineering and Applied Science 

Institute for Management Science and Engineering 



ABSTRACT 

In this paper a model is presented which focuses on the difficult problem of pre- 
dicting demands for items with extremely low usage rates. These form the bulk of repair 
parts in military systems. The basic notion underlying the model is the pooling of usage data 
for common design items with movement for the purpose of estimating usage rates for similar 
items which have shown no movement. 

A unique feature of the model is that it also makes possible the estimation of usage rates 
for items newly introduced into a system for which no previous usage history is available. 

0. INTRODUCTION 

The problem of predicting demands for individual repair parts in military inventory systems has 
received much attention over the last two decades. This problem is a complicated one because of the 
sporadic nature of demands for military repair parts. For most repair parts, no demands are registered 
over long periods of time and when items are demanded, they are generally demanded only once or 
twice. This fact has now been documented by many usage studies† and is once again documented
in this study. To illustrate the nature of the demand problem under consideration, usage data for 61 
submarine patrols are shown in Table 1. As may be seen from the first entry in the table, no usage 



TABLE 1. Distribution of 25,138 Different Repair Parts by the Number of Patrols in Which They Were Demanded

    Number of patrols    Frequency of different parts
    0                    21,597
    1                     1,776
    2                       673
    3                       333
    4                       199
    5                       134
    6                        96
    7                        81
    8                        62
    9                        32
    10                       28
    11-61                   127
    Total                25,138



*Columbia University. 

†For a review of this literature, see Henry Solomon, "A Summary of Logistics Research Project's Experience With Problems of Demand Prediction" in [1].




536 S. E. HABER AND R. SITGREAVES 

was recorded for the vast majority of items, i.e., the vast majority of items was not demanded in any 
one of the 61 patrols. Of those that were demanded, one-half were demanded in exactly one patrol. 
Thus for most repair parts with usage, the problem of estimating usage rates is a difficult one. The
difficulty is compounded, by an order of magnitude, for the bulk of the items whose usage history shows
zero units used.

In this paper we will be concerned with the estimation of expected usage of repair parts for the 
purpose of computing shipboard allowance lists. A shipboard allowance list is defined as the range 
and depth of repair parts to be stocked aboard ship to meet uncertain demands. The range of repair 
parts refers to the number of different items to be stocked. The depth refers to the number of units 
stocked of an item. 

Given that repair part usage is sporadic, several demand prediction strategies are available. The 
most widely practiced approach is that of employing usage estimates provided by technicians, i.e., 
supply personnel responsible for provisioning of repair parts. In practice, these have been found to 
be conservative and lead to relatively expensive stockage lists. Such conservatism, however, is pre- 
ferred to a much more extreme approach which might assign a zero usage estimate to a repair part 
until positive usage is experienced. The difficulty with this latter approach is that many repair parts 
are only one-time movers. Failure to have an adequate quantity of stock aboard ship or in the supply 
system prior to the first demand can thus lead to a large range of shortages and an unsatisfactory level 
of readiness. 

Another approach that has been utilized to estimate usage of slow moving items is exponential 
smoothing [2]. In this procedure, a technician's usage estimate is generally employed as an initial esti- 
mate. Hence, the initial procurement for repair parts will be based solely on the technician's estimate 
and will thus be subject to the limitation already noted.* 

One procedure for the problem at hand is to utilize information not directly pertaining to the repair
part being considered.† This is the approach of this paper. The information used pertains to the class
of repair parts of which the given repair part is a member. It is assumed that usage data are available 
for the repair part class and that some of the items in that class have shown movement in the past. The 
advantage of this procedure is that it permits the pooling of demands where they have occurred and 
the use of this information for making positive usage estimates for items for which zero usage has been 
recorded. The procedure also provides an expected usage estimate for new items being introduced 
into the inventory system for which no usage history is available. 

The criterion used in this study for defining repair part class is that of nomenclature of which 
resistor, washer, motor, and valve are examples. It should be noted that within a given class, estimated 
usage rates will vary depending on the design, environment, location, etc., of each part. The usefulness 
of partitioning by nomenclature rests on the assumption that variations in usage rates within a given 
nomenclature class will be less than that among classes. In this case, the stratification of repair items 
should reduce the variance of the estimates vis-a-vis the alternative of not distinguishing items by 
nomenclature class. 

In the next section we present a theoretical model for estimating expected usage of repair parts, 
in particular repair parts with no usage history. Following a description of the model, goodness-of-fit 



*Furthermore, for zero-movers this procedure will quickly lead to usage estimates that are indistinguishable from zero. 
†A model of this type, which uses information pertaining to the failure behavior of the part's parent component, is described in [5].



ESTIMATING USAGE OF REPAIR PARTS 537 

tests are applied. In the final section, the model is evaluated by developing alternative allowance lists 
and comparing these lists against actual submarine usage data. 

1. THE PROBABILITY MODEL 

We consider a class C of items defined, for example, in terms of nomenclature. Let part i be any
item classified as belonging to class C, and let yᵢ = 0, 1, 2, . . ., represent the total quantity of units
demanded for part i in a specified time period T, say a total of T patrols.* We consider a probability
model in which the quantity y for a given part (more precisely, yᵢ) is a random variable with a Poisson
distribution given by

(1)    p(y|θ) = e^(−Tθ) (Tθ)^y / y!,

where θ (again more precisely, θᵢ) is the parameter of the Poisson distribution of demands for item i
in a unit time period, in our case, a single patrol.† Note that θ is thus the expected number of units de-
manded for part i in a patrol. It is assumed further that demands for part i in non-overlapping periods
of time are independently distributed.

Our problem is to estimate θ for any item i classified as belonging to class C. We distinguish two
cases. In the first case, we are concerned with estimating θ for installed items for which usage data,
i.e., y values, are available. As indicated earlier, the problem here is complicated by the fact that for
the majority of items no usage is recorded, i.e., the observed y values during T time periods are zero. In
the second case, we would like to estimate θ, that is, the expected usage, for items classified as belong-
ing to class C, but being installed for the first time. In this case, no y values, zero or otherwise, are
available.

In both cases, it seems intuitively reasonable to assume that positive usage data for some members
of the class should be useful in determining estimates of the θ values for the remaining members. We
formalize this by postulating that θ is itself a random variable with a probability distribution over all
items in the class C. We then use standard theory to obtain the desired estimates for both cases men-
tioned above.

In general, if p(θ) denotes the probability distribution of θ in the class C, and p(y, θ) the joint
distribution of y and θ, then

(2)    p(y, θ) = p(y|θ) · p(θ)
              = p(θ|y) · p(y),

where p(y) denotes the unconditional distribution of y values for the class C, and p(θ|y) the conditional
distribution of θ, given y.** In the first case mentioned above in which the observed y = 0, 1, 2, . . ., we
estimate θ by

    θ̂ = E(θ|y)

from the conditional distribution of θ. In the second case, when y values do not exist, we estimate θ



*For simplicity, we do not distinguish between the random variable y and the values which it assumes. Throughout, we
also use the symbol p( ) to represent different probability distributions.

†The subscript i has been omitted from y and θ here and in the subsequent discussion to facilitate exposition.
**In Bayesian terms, p(θ) is the prior distribution, and p(θ|y) the posterior distribution.




by the value E(θ), the unconditional expected value of θ. In this latter case, the estimate of θ is the
same for all new items in the class C, while in the former case θ̂ varies for each item in class C depending
on its y value.

In considering possible distributions for θ, we assume that the class C can be extended in such a
way that θ can be treated as a continuous variable with a probability density function. The preponder-
ance of y values of zero in most classes led us to consider densities whose maximum value occurs for
θ = 0. We examined first the exponential density

    p(θ) = (1/β) e^(−θ/β)  for 0 < θ < ∞,

but resulting calculations did not give evidence of a good fit. A natural extension, which because of
its mathematical properties seemed particularly appropriate for the distribution of a Poisson param-
eter, is a two-parameter gamma distribution. Accordingly, we assumed that

(3)    p(θ|α, β) = (α/β)^α θ^(α−1) e^(−αθ/β) / Γ(α)  for 0 < θ < ∞,

with α, β > 0. For any value of α < 1, this function is infinite at θ = 0, and is monotonically decreasing
as θ increases from 0 to ∞.

From (1) and (3), Eq. (2) can be written specifically as

(4)    p(y, θ|α, β) = p(y|θ) · p(θ|α, β)

       = α^α T^y θ^(α+y−1) e^(−(α+Tβ)θ/β) / [β^α Γ(α) y!]

       = { [(α + Tβ)/β]^(α+y) θ^(α+y−1) e^(−(α+Tβ)θ/β) / Γ(α + y) } · { Γ(α + y) α^α (Tβ)^y / [Γ(α) y! (α + Tβ)^(α+y)] }

       = p(θ|y, α, β) · p(y|α, β).

Thus, the conditional distribution of θ, given y, also has the form of a gamma distribution, while the
unconditional distribution of y for the class C is a negative binomial.
From (4) and (3), we find

(5)    θ̂ = E(θ|y) = (α + y)β / (α + Tβ)

for the first case where y values are available, while

(6)    E(θ) = β,

for the second case where items are being installed for the first time and there are no y values.

For any item in class C with an observed y value, we now have from (5) an estimator for the expected
usage for the item, namely θ̂. The estimator can be evaluated for every y value, including zero, if we







have values for the two parameters α and β. We estimate these parameters from the observed set of
y values for the class C, treating these values as a set of independent observations from a negative
binomial distribution with mean value Tβ and with variance Tβ(1 + Tβ/α).

Let y₁, y₂, . . ., yₙ be the observed y values for the n items i = 1, 2, . . ., n in class C. From the
data we estimate the mean and variance by

    ȳ = (1/n) Σᵢ yᵢ  and

    v = [1/(n − 1)] Σᵢ (yᵢ − ȳ)²,  respectively.

We estimate Tβ by ȳ, so that

    β̂ = ȳ/T.



In estimating α, we use the method of moments since this is relatively simple and straightforward.
Since the variance of y is Tβ(1 + Tβ/α), we estimate the variance as

    v = Tβ̂(1 + Tβ̂/α̂),  and with β̂ = ȳ/T from above, obtain

    α̂ = ȳ² / (v − ȳ).

Hence, in the case where an observed y value is available for a given item, the desired estimate of
θ for the item is

    θ̂ = [Tβ̂ / (α̂ + Tβ̂)] · [(α̂ + y)/T],

and in particular when y = 0, we have

    θ̂ = α̂β̂ / (α̂ + Tβ̂) > 0.

For y > 0, as T becomes large, the quantity Tβ̂/(α̂ + Tβ̂) approaches the value 1, and θ̂ approaches y/T.

In the case of new items being introduced into the inventory system, the estimated expected usage
for the item is given by β̂ = ȳ/T, where it is seen from (6) that β is the expected value of θ for the repair
part class describing the new item.
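The estimation procedure of this section — moment estimates of α and β from a class's y values, followed by the shrinkage estimate of θ for each item — can be sketched as follows (Python is ours, not the paper's; the function name and the illustrative data are assumptions):

```python
def usage_estimates(y_values, T):
    """Method-of-moments estimates for one repair part class.

    y_values -- total units demanded over T patrols, one entry per part
    T        -- number of patrols of usage history
    Returns (alpha, beta, theta) where theta[i] is the estimated expected
    usage per patrol for part i; beta alone is the estimate for a part
    newly introduced into the class.
    """
    n = len(y_values)
    ybar = sum(y_values) / n
    v = sum((y - ybar) ** 2 for y in y_values) / (n - 1)
    if v <= ybar:
        raise ValueError("sample variance must exceed the mean")
    beta = ybar / T                    # since E(y) = T*beta
    alpha = ybar ** 2 / (v - ybar)     # from v = ybar * (1 + ybar/alpha)
    shrink = T * beta / (alpha + T * beta)
    theta = [shrink * (alpha + y) / T for y in y_values]
    return alpha, beta, theta
```

Note that every part with y = 0 receives the same small positive estimate αβ/(α + Tβ), while a part with movement is pulled only partway toward its observed rate y/T.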






2. EVALUATION OF GOODNESS-OF-FIT 

In the preceding section, we assumed that the distribution of expected usage for items in a given 
class C was a two-parameter gamma distribution. This, in turn, led to a negative binomial distribution 
of demands for items in the class. In this section, we examine the goodness-of-fit of the model just 
described. It will be recalled that a similar assessment of the earlier exponential model led to its 
rejection. The purpose here is not an exact test of a particular hypothesis, but rather to determine the 
reasonableness of the model finally adopted. An additional test of the model in an inventory context is 
provided in the next section. 

In examining the goodness-of-fit of the model, a large number of repair part classes were defined
on the basis of nomenclature, and for each class α̂ and β̂ were computed from the available data.
Having obtained these estimates, theoretical negative binomial distributions of demands for items in 
each class were calculated and compared with the actual distributions of y values. The comparison of 
the actual and theoretical frequencies for each class was made by computing the value of chi-square as 
an index of goodness-of-fit. Again, it was not the purpose to use each of the chi-squares as a rigorous 
test of the corresponding null hypothesis. The intent was to utilize the chi-squares and the associated 
significance probabilities as the basis for assessing the appropriateness of the model. 
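As a sketch of the computation just described (ours, not the authors'; the function names are illustrative), the fitted negative binomial probabilities implied by Eq. (4) can be evaluated with log-gamma functions and compared with observed frequency counts through a chi-square index:

```python
from math import exp, lgamma, log

def neg_binomial_pmf(y, alpha, t_beta):
    """Unconditional P(Y = y) for the gamma-mixed Poisson demand model,
    with gamma shape alpha and mean total demand t_beta = T*beta."""
    return exp(lgamma(alpha + y) - lgamma(alpha) - lgamma(y + 1)
               + alpha * log(alpha) + y * log(t_beta)
               - (alpha + y) * log(alpha + t_beta))

def chi_square_index(counts, alpha, t_beta):
    """Chi-square index of fit; counts[y] is the number of parts with
    total demand y, and the final cell pools the upper tail."""
    n = sum(counts)
    chi2, tail = 0.0, 1.0
    for y, obs in enumerate(counts):
        if y < len(counts) - 1:
            p = neg_binomial_pmf(y, alpha, t_beta)
            tail -= p
        else:
            p = tail              # pooled probability of the last cell
        chi2 += (obs - n * p) ** 2 / (n * p)
    return chi2
```

As the authors note, for fixed relative discrepancies this index grows with the number of parts in the class, so a large chi-square in a big class need not signal a practically poor fit.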

In evaluating the results, the following points should be kept in mind. First, because of the vagaries 
of reporting, no model may provide a satisfactory fit to the data. For example, extremely large y values 
may be expected as a result of mispunched data or stockpiling of material. Additionally, demands for 
repair parts are often for even numbered quantities. The prevalence of demands for even quantities 
may be seen from the distribution of y values for 61 patrols shown in Table 2. 

Table 2. Distribution of 25,138 Different Repair Parts by the Total Quantity of Units Demanded During 61 Patrols

    Total demand quantityᵃ    No. of different repair parts
    3                         249
    4                         249
    5                         121
    6                         124
    7                          86
    8                          97
    9                          58
    10                         61
    11                         36
    12                         57
    13                         29
    14                         26
    15                         28
    16                         27
    17                          9
    18                         20
    19                         14
    20                         38
    21                         17
    22                         13
    23                         10
    24                         18

ᵃ For the total sample of 25,138 different repair parts, items with a total demand quantity of 0, 1, or 2 units during 61 patrols
numbered 21,597, 1,027, and 495, respectively. The number of different repair parts with a total demand quantity of 25 or more was 632.



Second, the nature of the chi-square statistic itself is such that relatively small differences be-
tween observed and expected relative frequencies will lead to large chi-squares if the sample is large.
For the purpose of evaluating goodness-of-fit within a demand prediction context, this relation is an
important one.

In performing the goodness-of-fit computations, we were able to determine the significance of 
chi-square for 54 classes of repair parts containing 18,847 different parts. For the repair classes ex- 






amined, no correction was made for the phenomenon of even quantity demands. It was possible, how- 
ever, to correct for the presence of outliers. Items in a repair part class were treated as outliers and 
eliminated if the smallest y value omitted was large relative to the largest y value included. In almost 
all cases the outliers had a very low unit price, or a high total installed population, or large individual 
demand quantities, or a combination of these characteristics. For example, in the repair part class 
"filters" containing 370 different filters, one filter had a total demand quantity of 320 units — all units 
of the item being demanded in a single transaction. Of the 18,847 parts in the sample, this item and 
37 other repair parts were eliminated as outliers.* 

The results of the goodness-of-fit computations, after elimination of the 38 items considered to 
be outliers, are shown in Table 3. 

TABLE 3. Summary of Chi-Square Computations

    Different repair    Number of repair    Number of classes with poor fit at
    parts in class      part classes        0.05 level    0.01 level
    100 or Less         10                  1             1
    101 to 499          30                  7             0
    500 and Over        14                  6             3
    Total               54                  14            4

Over all classes, poor fits were obtained for but 4 and 14 of the 54 repair part classes at the 0.01 
and 0.05 levels, respectively. As may be seen from Table 3, the incidence of poor fits increased as the 
number of repair parts in a class increased. In interpreting the results of Table 3, the earlier observa- 
tion that where the number of items in a class is large, discrepancies between observed and expected 
relative frequencies may still be small, should be recalled. Indeed, this was the case for almost all 
of the repair part classes where the chi-square was larger than expected on the basis of chance alone. 

3. FURTHER ASSESSMENT OF THE MODEL 

In addition to examining the goodness-of-fit of the model, shipboard allowance lists were com- 
puted using as input the demand prediction model previously described. These lists were then com- 
pared with an allowance list utilizing technicians' usage estimates, both in terms of dollar investment 
in stock and shortage counts. The purpose of this evaluation was (1) to simulate the performance of 
the model in the environment for which it was designed, and (2) to determine whether differentiating 
repair parts by nomenclature class represented an improvement over a simpler approach of grouping 
all items into a single class. 

The data base for an initial test consisted of 61 patrols of usage history. The items included in this 
initial test fall into the first category of repair parts distinguished in this paper, i.e., items for which 
usage data are available including data for items with "usage" of zero units. Employing past usage 



*The total of 38 outliers was concentrated in 16 repair part classes. No class with 100 or fewer different repair parts con- 
tained any outliers; 10 of the 30 classes with 101 to 499 parts contained outliers; while the remaining 6 classes with outliers 
came from the 14 classes with 500 or more parts. This distribution is not inconsistent with the plausible hypothesis that the
probability of observing an outlier in a given class increases with the number of different parts in the class. 



542 



S. E. HABER AND R. SITGREAVES 



data for the 61 patrols and the demand prediction model, usage rates were computed for each repair 
part under two procedures: (1) different a and (3 were computed for each nomenclature repair part 
class (Model II A), and (2) a single value of a and /3 was used for all repair parts regardless of nomen- 
clature class (Model II B).* Allowance list quantities were then computed for these procedures and the 
one incorporating technicians' usage estimates (Model I) using the inventory model described in [4]. In 
all cases the inventory model was used with the same parameters. Thus, the only difference in the com- 
putation of the allowance lists was the technique used for deriving usage estimates. The allowance list 
quantities were next compared against usage data during a subsequent 21 patrols. The data for these 
patrols were not used in the initial calculation of usage rates. After each new patrol the model allow- 
ance list quantities were updated. No updating procedure was available for the quantities computed 
using the technicians' estimates. 

Summary data describing the allowance list computed for items with previous usage history are 
shown in Table 4. As may be seen, Model I was about three times as expensive as the other two models. 
In terms of depth or number of units stocked, Model I stocked almost five times as many units as the 
other models. In terms of range or number of different items stocked, both Models I and II A stocked 
more items than Model II B. Thus, one effect of distinguishing among repair part classes on the basis 
of nomenclature was to increase the range of repair parts stocked by the model. 



Table 4. Range, Depth, and Dollar Value of Investment: Items With Previous Usage Historyᵃ

    Model    Range of items stockedᵇ    Depth of units stockedᶜ    Dollar value
    I        18.6                       112.3                      2,703.1
    II A     18.9                       25.4                       960.4
    II B     16.0                       22.5                       854.7

ᵃ Averages for 21 patrols. All figures in thousands.
ᵇ Number of different repair parts stocked.
ᶜ Number of units stocked.

The average range or number of different items with a shortage and the average depth or number 
of units short per patrol are shown in the first and second column, respectively, of Table 5. It should 
be remarked that the latter measure is not without difficulty of interpretation due to the problem of 
mix of different units of measure among items, e.g., some items are measured in feet while others are 
in units of "each." For the sake of completeness, however, this measure is included as an alternative 
measure of performance. 

In Table 5, shortage counts are provided separately for items not stocked and for items stocked. 
These two categories of stock are distinguished since items in the former category tend to be "not 
carried" over successive patrols. From Table 5, it is seen that for items stocked, there were on the 
average 19.5 and 23.1 different items with a shortage per patrol for Models I and II A, respectively. 
Over all items with a shortage, the total number of units short averaged 172.3 and 225.0. In terms of 
the number of units short per item short, Model I averaged 8.8 (172.3 ÷ 19.5) as compared to 9.7
(225.0 ÷ 23.1) for Model II A.






*For Model II B, α̂ = 0.00787 and β̂ = 0.02414.






Table 5. Shortage Counts: Items With Previous Usage Historyᵃ

    Shortages: all items
    Items            Rangeᵇ          Depthᶜ
    Not stocked:
      Model I        2.5 (2.2)       6.8 (13.3)
      Model II A     3.0 (2.8)       4.8 (5.1)
      Model II B     6.7 (4.0)       12.1 (8.8)
    Stocked:
      Model I        19.5 (12.8)     172.3 (132.2)
      Model II A     23.1 (14.3)     225.0 (196.3)
      Model II B     21.2 (13.8)     219.6 (195.0)

ᵃ Averages for 21 patrols. Standard deviations in parentheses.
ᵇ Number of different repair parts with shortages.
ᶜ Number of units short over all repair parts.

A second test similar to the one described above was performed for 4,094 items which were treated 
as new items being introduced into the system. It should be noted that none of these items were in- 
cluded in the previous test. Following the model, in developing usage rates for this test, only the parameter
β was used. For Model II A, β varied from class to class; for Model II B, β was invariant for all
items. In each case, the β value used was the same β value employed in the first test. Thus this second
test was a more stringent one in that not only were inventory quantities matched against unknown 
future usage (for 35 patrols), but in estimating item demand distributions the input data were from a 
completely different set of repair parts. 

Summary figures describing the allowance lists and shortage counts for items which were treated 
as new items being introduced into the system are shown in Tables 6 and 7, respectively. The format of 
these tables is the same as for Tables 4 and 5. 



Table 6. Range, Depth, and Dollar Value of Investment: Items With No Previous Usage Historyᵃ

    Model    Range of items stockedᵇ    Depth of units stockedᶜ    Dollar value
    I        3.9                        18.9                       450.2
    II A     3.7                        4.4                        231.8
    II B     3.2                        3.8                        145.2

ᵃ Averages for 35 patrols. All figures in thousands.
ᵇ Number of different repair parts stocked.
ᶜ Number of units stocked.

From Table 6, one notes that as in the case for items with previous usage history, Model I was the 
most expensive one. The additional dollar value of investment for Model I was once again accounted 
for by the large number of units stocked, given that an item was stocked. Likewise, the range of dif- 
ferent items stocked was least for Model II B. 

An examination of Tables 5 and 7 indicates that for items not stocked, Models I and II A performed 
about the same; Model II B performed less well than the other models because of its reduced range of 
items stocked. For stocked items, however, Model I performed better than the other models; the per- 
formance of Models II A and II B was very similar. Thus, on the basis of the shortage measures alone, 
Model I was ranked higher than Model II A because it had fewer shortages for stocked items. Model 
II A was ranked higher than Model II B because it had fewer shortages for nonstocked items. 






Table 7. Shortage Counts: Items With No Previous Usage Historyᵃ

    Shortages: all items
    Items            Rangeᵇ         Depthᶜ
    Not stocked:
      Model I        1.0 (1.2)      1.7 (2.1)
      Model II A     0.9 (1.4)      1.5 (2.0)
      Model II B     6.0 (6.4)      10.1 (10.0)
    Stocked:
      Model I        3.6 (3.5)      33.8 (53.6)
      Model II A     6.1 (4.3)      48.4 (58.8)
      Model II B     5.3 (4.0)      44.6 (57.7)

ᵃ Averages for 35 patrols. Standard deviations in parentheses.
ᵇ Number of different repair parts with shortages.
ᶜ Number of units short over all repair parts.

One should note that the difference in performance between Models I and II A was small. Model
I had 3 to 4 fewer items with a shortage per patrol; given a shortage, the number of units short per
item short was at most one less for Model I. On the other hand, the difference in investment cost be-
tween the two models was substantial. Model I was approximately 2 to 3 times as expensive as Model
II A. The difference in performance between Models II A and II B was about the same as that between
Models I and II A. In terms of investment cost, however, Model II B was somewhat less expensive
than Model II A.

The finding of small differences in performance between models is reinforced by an examination 
of shortage counts for those repair parts which were highly essential.* Shortage counts for this class 
of items are found in Table 8. As may be seen, for these items, with the exception of the depth shortage 
measure for stocked items with no previous usage history, all models performed about the same. 



Table 8. Shortages for Highly Essential Itemsᵃ

    Items            With previous usage history     With no previous usage history
                     Range (1)     Depth (2)         Range (3)     Depth (4)
    Not stocked:
      Model I        0 (0)         0 (0)             0.1 (0.2)     0.2 (0.7)
      Model II A     0 (0)         0 (0)             0 (0)         0 (0)
      Model II B     0 (0)         0 (0)             0 (0)         0 (0)
    Stocked:
      Model I        2.2 (3.4)     19.1 (27.4)       0.1 (0.2)     1.7 (7.8)
      Model II A     2.4 (3.7)     23.3 (33.9)       0.4 (0.6)     5.1 (11.2)
      Model II B     2.3 (3.6)     23.2 (33.9)       0.6 (1.2)     5.5 (11.8)

ᵃ Averages for 21 patrols in Cols. (1) and (2); averages for 35 patrols in Cols. (3) and (4). Standard deviations in parentheses.

*A discussion of military essentiality coding of repair items is found in [3].



ESTIMATING USAGE OF REPAIR PARTS 545 

Based on the findings of this section, we conclude that relative to the substantial difference in investment cost between Model I and Model II A, the difference in performance between these two models was small. Considering cost as well as performance, Model II A was judged superior to Model I. Because of the fewer shortage counts for Model II A vis-a-vis Model II B and the similarity of costs between them, Model II A, in which items were distinguished by nomenclature class, was judged superior to Model II B, where all items were lumped into a single class.

4. SUMMARY 

In this paper a model is presented which focuses directly on the difficult problem of predicting 
demands for items with extremely low usage rates, which form the bulk of repair parts in military 
systems. In the model, repair part demands are assumed to be Poisson distributed while their means 
are assumed to be gamma distributed. A basic notion underlying the model is the pooling of usage data 
for items that have shown some movement for the purpose of estimating usage rates for those items 
which have shown no movement. 

At the outset, repair parts were partitioned into different classes. An assessment of goodness-of-fit was performed for 54 different classes of items to determine whether the unconditional distribution of demands was indeed negative binomial, as postulated by the model. Given the vagaries of the data, e.g., disproportionately large numbers of even demands and large outliers due probably to mispunched data and stockpiling of material, the model fit the data quite well. Although the partitioning of repair parts was not essential to the model, it was assumed that such partitioning would yield improved estimates of usage rates. The goodness-of-fit computation and other tests conducted in an inventory context suggest that this indeed was the case.
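The gamma-Poisson structure behind this goodness-of-fit assessment can be checked numerically: mixing a Poisson demand distribution over a gamma-distributed mean yields a negative binomial unconditional distribution. The sketch below verifies the identity by numerical integration; the shape and scale values are hypothetical illustrations, not parameters fitted in the study.

```python
import math

def neg_binomial_pmf(k, r, p):
    # Negative binomial pmf with shape r and "success" probability p;
    # this is the gamma(r, scale=(1-p)/p)-mixed Poisson distribution.
    return (math.gamma(k + r) / (math.gamma(r) * math.factorial(k))) * p ** r * (1 - p) ** k

def gamma_poisson_pmf(k, shape, scale, grid=20000, upper=60.0):
    # Unconditional P(K = k): integrate the Poisson pmf against a
    # gamma(shape, scale) density for the mean, by a Riemann sum.
    h = upper / grid
    total = 0.0
    for i in range(1, grid + 1):
        lam = i * h
        gamma_pdf = lam ** (shape - 1) * math.exp(-lam / scale) / (math.gamma(shape) * scale ** shape)
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total += gamma_pdf * poisson * h
    return total

shape, scale = 1.5, 2.0          # hypothetical gamma parameters for the mean demand
p = 1.0 / (1.0 + scale)          # matching negative binomial parameter
for k in range(5):
    assert abs(gamma_poisson_pmf(k, shape, scale) - neg_binomial_pmf(k, shape, p)) < 1e-3
```

The same mixture logic underlies the pooling idea: a gamma prior fitted from the items that moved supplies a positive usage estimate even for an item with zero observed demand.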

A unique feature of the model is that in addition to providing positive usage estimates for repair parts with previous usage history, regardless of whether or not a particular part was observed to move, the model also makes possible the estimation of usage rates for new items for which no previous usage history is available. In an inventory context under stringent test conditions, the model performed equally well in both settings when compared with the current procedure for estimating usage rates. Mean shortage counts for the model were slightly higher over all items, and about equal for highly essential items, relative to mean shortage counts for the current procedure. On the other hand, differences in cost were marked, with the current procedure costing two to three times as much as the proposed model. As indicated by this study, the notion of pooling usage data, and from such data extrapolating usage rates for installed items with zero usage or for items being newly introduced into a system, is a useful one.

ACKNOWLEDGMENT 

The authors wish to thank William Hise and Talbot Walls, Jr., for their programming assistance. 

REFERENCES 

[1] Astrachan, M. and A. S. Cahn (Editors), Proceedings of Rand's Demand Prediction Conference, January 25-26, 1962, The Rand Corporation, RM-3358-PR (1963).
[2] Brown, R. G., Statistical Forecasting for Inventory Control (McGraw-Hill, New York, 1959).
[3] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, F. W. Segel, and H. Solomon, "The Polaris Military Essentiality System," Nav. Res. Log. Quart. 11, 235-257 (1964).




[4] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, and H. Solomon, "A Polaris Logistics Model," Nav. Res. Log. Quart. 11, 259-272 (1964).
[5] Haber, S. E., R. Sitgreaves, and H. Solomon, "A Demand Prediction Technique for Items in Military Inventory Systems," Nav. Res. Log. Quart. 16, 302-308 (1969).



A STOCHASTIC CONSTRAINED OPTIMAL REPLACEMENT MODEL* 



Peter J. Kalman 



Department of Economics 

State University of New York 

Stony Brook, N.Y. 



ABSTRACT 



In this paper a stochastically constrained replacement model is formulated. The model determines a sequence of replacement dates such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon for the n initial incumbent machines is minimized, subject to the constraints that a prescribed expected number of machines is in each chosen utility class at any point in time. We then indicate one possible solution method for the model.



I. INTRODUCTION 

The paper is structured as follows. We first define the basic notation in section 2. Next, in section 3, an analytical model is developed which determines a sequence of replacement dates for the ith initial incumbent machine (i = 1, 2, . . ., n) such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon is minimized subject to the constraint that at any point in time there exists a desired expected number of machines in any chosen "utility" class. We then discuss one possible solution (out of many) under specified assumptions.

II. NOTATION 

Suppose there exist n initial incumbent machines. In order to define the model, the following notation is useful:

u^i_j, the utilization rate of the jth machine in the sequence of replacements for the ith initial incumbent machine, i = 1, . . ., n, j = 1, 2, . . . (from here on this machine is abbreviated MRIM_ij);

L^i_j, the age of MRIM_ij when the decision to replace it is made;

l^i_j, the age of MRIM_ij when it is replaced;

τ, the age of MRIM_ij, 0 ≤ τ ≤ l^i_j;

C^i_j, the utility class of MRIM_ij (C^i_j = 1, 2, . . ., q in decreasing order of desirability. That is, each machine is represented by an ordinal measure of its utility characterized by one of the above integers. Moreover, every machine belongs to one and only one utility class at any point in time.);

R^i_j, a parameter indicating whether MRIM_ij is new or a modernization:



*This work was started while the author was on the professional staff of the Center for Naval Analysis. It was completed 
while the author was a visiting professor at the Naval Postgraduate School. 

547 



548 P. J. KALMAN



R^i_j = { 1 if MRIM_ij is new; 0 if MRIM_ij is a modernization };



t^i_j, the time when a decision to purchase a replacement for MRIM_ij is made, 0 ≤ t^i_j ≤ ∞;

x^i_k(t), the expected number of machines in the ith sequence in utility class k at time t;

X_k(t), the expected number of machines in utility class k (k = 1, . . ., q) at time t (0 ≤ t ≤ ∞);

p_kj(τ), the probability that a machine which was in class k when it began operating will have moved to class j by the time it is τ years old;

t̄^i, the time the latest replacement machine in the ith sequence began operating;

t̃^i, the starting time of the latest modernization in the ith sequence.

In the model to be presented, the term "replacement" includes "modernization" as well as new machine acquisition. The same applies to "purchase of a replacement". Hence, if MRIM_ij is "modernized", then the modernized version is called "new" and a "new machine" when it enters the process. It follows that "age" represents "duration of time in the process". The symbols L^i_j, C^i_j and R^i_j represent decision variables, and C^i_j refers to the utility class of MRIM_ij when new.

III. THE MODEL 

The model defined below allows a three-way choice at any point in time: not replacing, modernizing, or building a new machine. Furthermore, the model will determine a sequence of replacement purchase times (i.e., {t^i_j}, j = 1, 2, . . ., for each i, i = 1, . . ., n) such that the total "current account" cost of all future costs over an infinite time horizon is minimized subject to the constraints that at each point in time there exists a prescribed expected number of machines in each chosen utility class k (k = 1, 2, . . ., q). It is assumed that there exist n initial incumbent machines.

The operating expense (cost) function for MRIM_ij is

(3.1) φ^i_j(u_j, t_{j-1}, τ, R_j, C_j)

where it is understood that the superscript outside a function applies to all the appropriate arguments of that function. Clearly, u_j, τ, R_j and C_j influence the operating expense of MRIM_ij. The time at which the decision to replace is made (t_{j-1}) determines the technological state of advancement of the machine to be installed and hence can also influence φ^i.

In addition to operating cost, there is one other major category of cost which must be considered: the investment cost (W^i_j) of MRIM_ij. This will depend upon whether it is a new machine or a modernization and on the utility class chosen for it. That is,

(3.2) W^i_j(R_j, C_j).

Since machines normally have some salvage value at the time of replacement, W^i_j is not the net capital cost of a machine. To obtain this, we subtract the salvage value (which depends on the replacement age) of MRIM_ij. Hence, the net capital cost of MRIM_ij is

(3.3) W^i_j(R_j, C_j) − S^i_j(l_j) e^{−r l_j}.

We note that salvage value may also depend on absolute time, but for simplicity we omit this. There is one other important type of element to be considered before one can formulate the "current



STOCHASTIC CONSTRAINED REPLACEMENT MODEL 549



account" cost. This is the time required to accomplish a replacement of MRIM_ij of age L^i_j if the replacement action starts at time t^i_j. Before defining this relation, we point out that if a machine is to be replaced by an acquisition, it will remain in service until the new machine arrives, whereas it may be immediately withdrawn if it is to receive modernization. The time required to accomplish a replacement of MRIM_ij of age L^i_j if the replacement action starts at time t^i_j is represented by

γ^i_j(L^i_j, t^i_j).

The L^i_j and the t^i_j are related by

t^i_1 ≥ 0,

t^i_j = t^i_{j−1} + γ^i_{j−1}(L^i_{j−1}, t^i_{j−1}) + L^i_j,  j ≥ 2, i = 1, 2, . . ., n.

For future notational convenience, let γ^i_j represent

γ^i_j(L^i_j, t^i_j),  j = 0, 1, 2, . . ., i = 1, . . ., n.

Define r as the continuous rate of discount. Also, define the current account cost (w^i) as the outlay stream that has the same present value as all the cost outlays associated with the ith initial incumbent machine in an infinite chain of replacements. That is,

(3.4) w^i = [ ∫_0^{l'} φ^i_0(u_0, τ, R_0, C_0) e^{−rτ} dτ − S^i_0(l_0) e^{−r l_0} ]

+ e^{−r(t^i_1 + γ^i_1)} [ ∫_0^{l'} φ^i_1(u_1, t_0, τ, R_1, C_1) e^{−rτ} dτ + W^i_1(R_1, C_1) − S^i_1(l_1) e^{−r l_1} ]

+ . . .

+ e^{−r(t^i_{m+1} + γ^i_{m+1})} [ ∫_0^{l'} φ^i_{m+1}(u_{m+1}, t_m, τ, R_{m+1}, C_{m+1}) e^{−rτ} dτ + W^i_{m+1}(R_{m+1}, C_{m+1}) − S^i_{m+1}(l_{m+1}) e^{−r l_{m+1}} ]

+ . . .

where i = 1, . . ., n,

l' = { L^i_{m+1} if the (m+1)st machine's replacement in the ith sequence is a modernization; L^i_{m+1} + γ^i_{m+1} if a new machine is built }, m = −1, 0, 1, 2, . . .,

and φ^i_0(u_0, τ, R_0, C_0) is the operating expense function for the ith incumbent machine.



Note that the investment cost of the ith incumbent machine, W^i_0, does not appear in (3.4) since, by hypothesis, the incumbent machine is a sunk investment whose capital cost should not be allowed to influence decisions. The integrals in (3.4) discount the cost stream of each replacement back to the point in time when the machine was new. The exponential term outside each set of brackets then discounts all these cost streams back to the present time. Similarly, the second and third terms inside




each bracket, when discounted, determine the present value of all future investment outlays net of salvage values. Finally, I would like to note that equation (3.4) can be reduced to its discrete analogue in the dynamic programming framework under appropriate assumptions.²

The constraint set will now be formulated. It will be formulated first as an algebraic system and then as a differential equation system. The choice of which formulation to use will depend upon the application of the model. From the definitions (of section 2) it follows that

(3.5) X_j(t) = Σ_{i=1}^{n} Σ_{k=1}^{q} ξ^i_k([t^i]) p_kj(t − [t^i]),  j = 1, 2, . . ., q,

with the constraint

(3.6) X_j(t) = X̄_j(t),  j = 1, . . ., q

where X̄_j is chosen by the decision maker (we specify a way for calculating the X̄'s below), and

[t^i] = max {t̄^i, t̃^i},

ξ^i_j(t) = { 1 if the machine in use at time t in the ith incumbent machine's sequence is in utility class j; 0 if it is not }.

Note that ξ^i_j(t) is a function of current time t. If a machine in the ith sequence is in the process of being modernized at time t, then ξ^i_j(t) = 0 for all j. Consider the inner summation first. ξ^i_k([t^i]) indicates the utility class which we selected for the machine currently in use at time t in the ith sequence when it was new, or tells us that a modernization is currently being undertaken, and consequently, that no machine in the ith sequence is currently in use (i.e., ξ^i_k([t^i]) = 0 for all k). For a sequence i in which modernizations are not being undertaken at time t, ξ^i_k([t^i]) = ξ^i_k(t̄^i) = 1 for one and only one k. If [t^i] = t̄^i, then t − [t^i] gives the age of the machine in use at time t in the ith sequence. If [t^i] = t̃^i, the p_kj(t − [t^i]) term is irrelevant since ξ^i_k([t^i]) = 0 for all k. Thus, the inner sum tells us, for each sequence i in which a machine is in use at time t, the probability of that machine being in utility class j at time t, given the class of the machine when it was built. Then, with the outer summation, we simply sum over all machines in use at time t.
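The double sum in (3.5) is mechanical to evaluate once the ξ^i_k and p_kj values are in hand. The sketch below is a hypothetical illustration (the transition function, class count, and fleet are invented, not taken from the paper): for each sequence with a machine in use, it adds the probability that the machine now sits in class j, given its class when built and its age t − [t^i].

```python
import math

def transition(k, j, age, q=3, rate=0.2):
    # Hypothetical p_kj(age): a machine drifts downward in desirability
    # (k -> k+1 -> ...) with a Poisson-distributed number of steps; the
    # worst class q absorbs all remaining probability mass.
    steps = j - k
    if steps < 0:
        return 0.0
    if j < q:
        return math.exp(-rate * age) * (rate * age) ** steps / math.factorial(steps)
    return 1.0 - sum(transition(k, m, age, q, rate) for m in range(k, q))

def expected_in_class(j, t, sequences, q=3):
    # Constraint (3.5): sequences holds (class when built, [t^i]) pairs,
    # with class None when the sequence is mid-modernization (xi = 0 for all k).
    total = 0.0
    for built_class, start in sequences:
        if built_class is None:
            continue
        total += transition(built_class, j, t - start, q)
    return total

fleet = [(1, 0.0), (2, 3.0), (None, 5.0)]      # third sequence is being modernized
X = [expected_in_class(j, 6.0, fleet) for j in (1, 2, 3)]
assert abs(sum(X) - 2.0) < 1e-9                # two machines in use; each row of p sums to 1
```

The outer loop over `sequences` is the outer sum over i; skipping `None` entries implements the rule that a sequence under modernization contributes nothing.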

In the above formulation, the constraints are functions of the replacement purchase times ([t^i] is either a replacement purchase time or a time at which a replacement machine begins operation, which is related to a replacement purchase time by the function γ^i) and the utility classes which we choose for the replacement machines, as reflected in the values of ξ^i_k(·). The constraints may have a countable number of discontinuities in an infinite horizon model, since it is likely that there will be a 'bump' every time we are able to choose a utility class for a machine (i.e., when we have a modernization or build a new machine).

An alternative formulation of the constraint set will be given via a system of differential equations. 
This formulation involves the solution to a differential equation system while the above formulation 
does not. 

For each sequence i (i = 1, . . ., n) we have the following matrix differential equation:

(3.7) dX^i/dt = P(t − [t^i]) X^i

² See [1].



STOCHASTIC CONSTRAINED REPLACEMENT MODEL 551 

where P is a q × q matrix of the transition probabilities p_kj(t − [t^i]) and X^i is a q × 1 matrix of the x^i_j(t), subject to the initial condition

(3.8) X^i([t^i]) = X^i(0)

where X^i(0) is a q × 1 matrix whose jth element is x^i_j([t^i]), where

x^i_j([t^i]) = { 1 if [t^i] = t̄^i and we selected utility class j for this machine when it was built; 0 otherwise }.

That is, if modernization is going on at time t in the ith sequence, we do not have to be concerned about any transition probability between classes because no machine is in use. Otherwise, we look back to see in what class we put the machine in use at time t in the ith sequence when it was new. In system (3.7) the jth equation

dx^i_j/dt = Σ_{k≠j} p_kj(t − [t^i]) x^i_k(t) + (p_jj(t − [t^i]) − 1) x^i_j(t),  j = 1, . . ., q,

states that the net rate of change of the number of machines in the ith sequence in class j in an infinitesimal time period equals the total number of machines which enter class j in the ith sequence at time t minus the number of machines in the ith sequence which leave class j for other classes.

If we assume that the matrix P is an analytic coefficient matrix, then our finite system of first-order linear differential equations with analytic coefficient functions, (3.7), subject to the initial value (3.8), admits a unique analytic-function solution, which can be found explicitly by equating coefficients in the relevant power series.³ Clearly, there are an infinite number of functions in this class which could be used to represent the transition probabilities as a function of time. Linear or exponential functions are just two elements of this class which we can use.

One should note that we have a set of q differential equations for each sequence i. Let us denote:

TNDM_i, the time of the next decision to modernize a machine in the ith sequence;
TNRO_i, the time the next replacement machine begins operation in the ith sequence.

For fixed i, we may have a different set of initial conditions for each interval of the form [t^i] ≤ t < min_i {TNDM_i, TNRO_i}, reflecting the series of decisions we can make concerning the class of the replacement machines. Thus, for each i, we would have at most q non-trivial sets of differential equations to solve, corresponding to the q different non-trivial initial conditions we may have. Of course, a solution corresponding to a given set of initial conditions may apply to different length intervals of the form [t^i] ≤ t < min_i {TNDM_i, TNRO_i}.

Thus, the x^i_j(t) will be functions of the replacement purchase times and the utility class decisions.






³ See [2], pp. 89-92.




Then X_j(t) = Σ_{i=1}^{n} x^i_j(t). However, we could only hope for continuity of X_j(t) for

max_i [t^i] ≤ t < min_i {TNDM_i, TNRO_i}.

That is to say, x^i_j(t) would be continuous only on intervals t ∈ ∩_i ([t^i], min_i {TNDM_i, TNRO_i}). The same is true of the continuity of X_j(t) in our non-differential equation approach. For example, suppose we have two machines:

(Diagram: timelines for Machines 1 and 2, marking their replacement purchase times; dotted lines mark the interval in which no utility-class decisions are made.)

Specifically, suppose at t^1_2 a decision is made to build a new machine in sequence 1, while t^2_2 involves a decision to modernize the machine in use in sequence 2. Thus [t^1] = t̄^1 and [t^2] = t̃^2, so that

∩_i ([t^i], min_i {TNDM_i, TNRO_i}) = (t̃^2, t^1_3),

as denoted by the dotted lines. It is within this interval that no decisions regarding the utility class of the machines are made. Consequently, no "bumps" are experienced. Instead, machines change classes smoothly according to the transition probabilities.

It should be emphasized that the main factor behind the two different approaches to the constraint 
formulation is the definition of the transition probabilities. It seems likely that the first approach is 
preferable from a practical standpoint. 

If the transition probability matrix is a constant matrix P, it is well known that system (3.7) has the following unique solution:

(3.12) x^i_l(t) = Σ_{j=1}^{k} Σ_{m=1}^{m_j} c_{jm} t^{m−1} e^{μ_j t} V_{lj}

where μ_1, . . ., μ_k are the eigenvalues of P with multiplicities m_1, . . ., m_k, respectively, V_{lj} is the eigenvector associated with μ_j, and the constants c_{jm} are determined by the initial condition (3.8). The solution (3.12) represents the expected number of machines in class l (l = 1, . . ., q) in the ith sequence at time t.
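Rather than working through the eigenstructure, the constant-coefficient case can also be solved numerically. The sketch below takes one hedged reading of the jth equation above, in which the coefficient matrix is A = Pᵀ − I with P a one-step transition matrix, so that X(t) = e^{At} X(0); the matrix P and the series-based matrix exponential are illustrative assumptions, not the paper's data.

```python
# Solve dX/dt = A X for constant A via a truncated power series for e^{At}.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(A, t, terms=40):
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]   # identity = (At)^0 / 0!
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = mat_mul(term, [[a * t / m for a in row] for row in A])  # next series term (At)^m / m!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

P = [[0.9, 0.1, 0.0],   # hypothetical one-step class transition probabilities
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 1.0]]
q = 3
A = [[P[k][j] - (1.0 if j == k else 0.0) for k in range(q)] for j in range(q)]  # A = P^T - I
X0 = [1.0, 0.0, 0.0]    # one machine, initially in class 1
E = mat_exp(A, 2.0)
Xt = [sum(E[j][k] * X0[k] for k in range(q)) for j in range(q)]
assert abs(sum(Xt) - 1.0) < 1e-9   # columns of A sum to 0, so the machine count is conserved
assert all(x >= 0 for x in Xt)
```

The conservation check mirrors the interpretation of the jth equation: what leaves one class enters another, so the total over classes in a sequence is constant between decisions.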

We now formulate the constrained optimization problem of determining a sequence of replacement times such that the total current account cost of all future costs and capital expenditures over an infinite time horizon of the n incumbent machines is minimized subject to the chosen constraints that X̄_k(t) machines are in utility class k at time t (k = 1, . . ., q). That is, we want to minimize Σ_{i=1}^{n} w^i, where w^i is defined by (3.4), subject to the constraints X_j(t) = X̄_j(t), j = 1, . . ., q, which are defined by (3.5). One possible way to proceed is to assume that the constraints are exactly satisfied for all values of t and make a discrete approximation of the objective function. Then we can apply dynamic programming techniques in order to solve the problem.
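The discrete dynamic-programming route can be sketched for the simplest case: a single unconstrained sequence in which, each period, we either keep the machine (paying an age-dependent operating cost) or replace it at cost K. All parameter values below are hypothetical; the constrained, multi-sequence version of the paper would add the class constraints to the state.

```python
def replacement_policy(max_age=30, K=50.0, c0=1.0, c1=2.0, beta=0.95, iters=2000):
    # Value iteration over machine age: V(a) = min(keep, replace).
    def operate(age):
        return c0 + c1 * age                 # hypothetical linear operating cost
    V = [0.0] * (max_age + 1)
    for _ in range(iters):
        V = [min(operate(a) + beta * V[min(a + 1, max_age)],   # keep one more period
                 K + operate(0) + beta * V[1])                 # replace with a new machine
             for a in range(max_age + 1)]
    # replace at the first age where replacing is no dearer than keeping
    return next(a for a in range(max_age + 1)
                if K + operate(0) + beta * V[1] <= operate(a) + beta * V[min(a + 1, max_age)])

age_star = replacement_policy()
assert 0 < age_star < 30   # an interior optimal replacement age exists for these costs
```

Because operating cost rises with age while replacement cost is flat, the optimal policy is a threshold in age, which is what the `next(...)` scan extracts.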

As an exercise we have solved the model under the following assumptions: 

(1) a constant level of utility over time; 

(2) a linear operating cost function; 




(3) a constant rate of machine utilization; 

(4) instantaneous replacement; 

(5) equal economic lives of all replacements after the first; 

(6) equal investment costs of all machines. 

IV. ACKNOWLEDGMENT

I would like to thank G. E. Bowden, L. Ravenscroft and D. Richter, members of the professional staff of the Center for Naval Analysis, for their helpful comments.

REFERENCES 

[1] Dreyfus, S., Dynamic Programming and the Calculus of Variations (Academic Press, New York, 1965).
[2] Hochstadt, H., Differential Equations: A Modern Approach (Holt, Rinehart and Winston, New York, 1964).
[3] Hotelling, H., "A General Mathematical Theory of Depreciation," Journal of the American Statistical Association 20, 340-353 (September 1925).
[4] Jorgenson, D. W., J. J. McCall, and R. Radner, Optimal Replacement Policy (Rand McNally, Chicago, 1967).



A NOTE ON THE CALCULATION OF EXPECTED TIME-WEIGHTED 
BACKORDERS OVER A GIVEN INTERVAL 



Chung-Mei Ho 
AMC Inventory Research Office 



ABSTRACT 

Two formulae are presented for calculating expected time-weighted backorders over a 
fixed time interval. One formula is a more precise form of a result found in the literature and 
is found using a direct intuitive approach. The second formula is derived using the steady- 
state distribution of inventory and is directly compatible with the use of steady-state (R,Q) 
models. 

The two formulae are compared and reconciled. 

1. INTRODUCTION 

A rather old problem is considered. An inventory system begins with N units of stock, all on hand. Inventory is depleted by demands which arrive in a random pattern. All demands occurring when the system is out of stock are backordered. What is the expected number of time-weighted backorders over a period of length L, assuming that no stock arrives during the period L? The time-weighting is linear, i.e., a backorder lasting for 2 weeks counts as much as two backorders lasting for 1 week.

Typically, of course, L represents a procurement lead time. Two solution techniques are compared. 
One approach is to deduce the answer from a result obtained in the solution of steady-state (R, Q) 
inventory models, such as described in Hadley and Whitin [2]. A more direct intuitive approach is 
presented in section 2. 

It is found that: (a) for Poisson distributed demand both approaches are equivalent, though the formulae found are very different in appearance; (b) when demand is large, and the Normal approximation to the Poisson is used, the results given by the two formulae are not the same; and (c) both give reasonable approximations to the true solution, i.e., the solution using Poisson demand. The two formulae improve as approximations as R becomes close to or smaller than the mean lead time demand.

The primary significance of this work is in reconciling the two approaches. Moreover, the direct approach formula is given in a more precise form than we have come across in the literature before. For example, see Hanssmann [3, pp. 47-48].

2. DIRECT APPROACH -POISSON DEMAND 

We assume a stationary Poisson demand process. Given the initial on-hand inventory is N units, 
what is the expected number of time-weighted backorders over the period L? 

Let Y be the total demand, u the expected value of demand, and B_L the time-weighted backorders over L. Figure 1 helps in the derivation.

From the diagram, the total time-weighted backorders are represented by the area B_L. There are Y − N orders outstanding just before the replenishment arrives.

555

(Figure 1: on-hand inventory declining over the interval L; the area below the zero level is the time-weighted backorders B_L.)

Numbering the customers backwards, let T_i be the waiting time of the (Y − N − i + 1)th customer backordered. Then T_1 is the waiting time of the last customer. Let the backorders given demand of Y be B_L|Y; then



(2.1) B_L|Y = Σ_{i=1}^{Y−N} T_i,  E(B_L|Y) = Σ_{i=1}^{Y−N} E(T_i|Y).



For a Poisson process, given the total demand over the period, the arrival time of each customer 
is a random sample from a uniform distribution. For a proof of this statement, see, for example, Karlin 
[4, chap. 9]. To simplify the derivation and without loss of generality, L is assumed to be 1. 

The T_i's have a Beta distribution,* and

(2.2) E(T_i|Y) = i/(Y + 1),  i = 1, 2, . . ., Y − N,

so

E(B_L|Y) = Σ_{i=1}^{Y−N} i/(Y + 1) = (Y − N)(Y − N + 1)/(2(Y + 1)),

thus

(2.3) E(B_L) = E[(Y − N)(Y − N + 1)/(2(Y + 1))] = (1/2) Σ_{Y=N}^{∞} [(Y − N)(Y − N + 1)/(Y + 1)] p(Y),

where

p(Y) = e^{−u} u^Y / Y!.
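Formula (2.3) can be checked against a direct simulation of the setup: draw Poisson demand, scatter the arrivals uniformly over the unit interval (the order-statistics property invoked above), and accumulate the waiting time of every customer beyond the Nth until the replenishment at the end of the period. The values of N and u below are arbitrary illustrations.

```python
import math, random

def expected_backorders(N, u, terms=60):
    # Eq. (2.3): (1/2) * sum_{Y>=N} (Y-N)(Y-N+1)/(Y+1) * p(Y), p Poisson(u).
    total = 0.0
    for Y in range(N, N + terms):
        p = math.exp(-u) * u ** Y / math.factorial(Y)
        total += (Y - N) * (Y - N + 1) / (Y + 1) * p
    return 0.5 * total

def poisson_sample(u, rng):
    # product-of-uniforms inversion; fine for moderate u
    L, k, prod = math.exp(-u), 0, rng.random()
    while prod >= L:
        k += 1
        prod *= rng.random()
    return k

def simulated_backorders(N, u, reps=50000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        Y = poisson_sample(u, rng)
        if Y > N:
            arrivals = sorted(rng.random() for _ in range(Y))
            total += sum(1.0 - t for t in arrivals[N:])   # each waits until t = 1
    return total / reps

assert abs(expected_backorders(5, 4.0) - simulated_backorders(5, 4.0)) < 0.05
```

The simulated area under the backorder curve and the closed-form sum agree to within Monte Carlo noise, which is the content of (2.1)-(2.3).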



3. RESULT IMPLICIT IN SOLUTION OF STEADY STATE (R, Q) MODEL

Consider an (R, Q) system with fixed procurement lead time L. If demand is stationary and Poisson, then using the steady-state approach† the expected total time-weighted backorders per year is

(3.1) E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y − R − 1)[P(Y) − P(Y + Q)]

where

P(Y) = Σ_{k=Y}^{∞} p(k).
* For reference, see Fisz [1, p. 377].
† See, for example, Hadley & Whitin [2, p. 184].



CALCULATION OF BACKORDERS 557 

Now as Q gets large, the probability that lead time demand will exceed Q approaches 0. If we assume the probability is effectively 0, and L is 1, then

(3.2) E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y − R − 1) P(Y)

and

(3.3) E(SSB_cy) = (Q/u) E(SSB_yr) = (1/u) Σ_{Y=R+1}^{∞} (Y − R − 1) P(Y),

where E(SSB_cy) is the expected time-weighted backorders per cycle. A cycle can be defined as the time between initiation of successive procurements. Equation (3.3) follows from the fact that if lead time demand does not exceed Q, there can be at most one procurement order outstanding at a time; this implies that cyclic statistics, such as cyclic backorders, can be determined from yearly statistics as shown.*

An implication of the assumption that there is never more than one order outstanding is that all R assets at the time an order is placed are on hand. Thus, backorders sustained over a cycle are equivalent to backorders sustained during that part of the cycle which begins with the start of the procurement lead time. This establishes an identity between backorders in a cycle and the problem posed in Section 2 (with R substituted for N):

(3.4) E(B_L) = E(B_cy) = (1/2) Σ_{Y=R}^{∞} [(Y − R)(Y − R + 1)/(Y + 1)] p(Y).

4. RECONCILIATION OF DIRECT APPROACH TO STEADY STATE (R, Q) MODEL 

Both Eqs. (3.3) and (3.4) are expected time-weighted backorders over a procurement lead time. A few steps are needed to show that they are identical.

E(SSB_cy) = (1/u) Σ_{Y=R+1}^{∞} (Y − R − 1) P(Y)

= (1/u) Σ_{Y=R+1}^{∞} (Y − R − 1) Σ_{k=Y}^{∞} p(k)

= (1/u) Σ_{Y=R+1}^{∞} Σ_{k=Y}^{∞} (Y − R − 1) p(k).

Interchanging the order of summation, we get

(1/u) Σ_{k=R+1}^{∞} p(k) Σ_{Y=R+1}^{k} (Y − R − 1)

= (1/u) Σ_{k=R+1}^{∞} p(k) Σ_{i=0}^{k−R−1} i

= (1/u) Σ_{k=R+1}^{∞} p(k) (k − R − 1)(k − R)/2.

* See, for example, Hadley & Whitin [2, Chap. 4].




Substitute Y = k − 1 to get

(1/u) Σ_{Y=R}^{∞} [(Y − R)(Y − R + 1)/2] p(Y + 1),

then use p(Y + 1) = [u/(Y + 1)] p(Y) (which is a Poisson property) to obtain

(1/2) Σ_{Y=R}^{∞} [(Y − R)(Y − R + 1)/(Y + 1)] p(Y),

which is the same as Eq. (3.4).
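The reconciliation can be confirmed numerically for Poisson demand: the steady-state form (3.3) and the direct form (3.4) agree to machine precision. The R and u values below are arbitrary; the pmf is evaluated in log space to avoid overflow far out in the tails.

```python
import math

def p(Y, u):
    # Poisson pmf evaluated in log space
    return math.exp(-u + Y * math.log(u) - math.lgamma(Y + 1))

def P_tail(Y, u, terms=400):
    # P(Y) = sum_{k >= Y} p(k), truncated far out in the tail
    return sum(p(k, u) for k in range(Y, Y + terms))

def steady_state_formula(R, u):
    # Eq. (3.3): (1/u) * sum_{Y=R+1}^inf (Y - R - 1) P(Y)
    return sum((Y - R - 1) * P_tail(Y, u) for Y in range(R + 1, R + 120)) / u

def direct_formula(R, u):
    # Eq. (3.4): (1/2) * sum_{Y=R}^inf (Y - R)(Y - R + 1)/(Y + 1) * p(Y)
    return 0.5 * sum((Y - R) * (Y - R + 1) / (Y + 1) * p(Y, u) for Y in range(R, R + 120))

for R, u in [(3, 2.0), (5, 4.0), (10, 8.0)]:
    assert abs(steady_state_formula(R, u) - direct_formula(R, u)) < 1e-8
```

Replacing the Poisson pmf by a Normal density breaks the agreement, which is exactly the discrepancy discussed in section 5.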

5. NORMAL APPROXIMATION TO POISSON DEMAND 

As total lead time demand increases, the Poisson distribution can be approximated by the Normal. Using the Normal approximation to Eq. (3.4) we have

(5.1) E(B_cy) = (1/2) ∫_R^∞ [(Y − R)(Y − R + 1)/(Y + 1)] f(Y) dY,

where

f(Y) = (1/(σ√(2π))) exp(−(1/2)((Y − u)/σ)²),

u is the expected lead time demand and σ equals √u. Using the Normal approximation to Eq. (3.3), we have

(5.2) E(SSB_cy) = (1/u) ∫_R^∞ (Y − R) G(Y) dY  where  G(Y) = ∫_Y^∞ f(t) dt.

Let w = G(Y) and dv = (Y − R) dY, and use integration by parts in (5.2). This gives

E(SSB_cy) = (1/(2u)) ∫_R^∞ (Y − R)² f(Y) dY,

which is different from Eq. (5.1).

The discrepancy is due to the use of the Normal approximation. In equating Eqs. (3.3) and (3.4), the key relation

p(Y + 1) = [u/(Y + 1)] p(Y)

is used, which holds only for the Poisson distribution and not for the Normal distribution. The two formulae are essentially the same, and empirically were found to be almost equally effective as approximations to the solution obtained using the Poisson distribution of demand.

ACKNOWLEDGMENTS 

I am indebted to Mr. Alan Kaplan and Mr. W. Karl Kruse for their contributions to this work. 
Appreciation is also extended to Professor Edward Silver for his helpful comments. 



CALCULATION OF BACKORDERS 559 

REFERENCES 

[1] Fisz, M., Probability Theory and Mathematical Statistics (John Wiley & Sons, Inc., New York 

and London, 1963). 
[2] Hadley, G. and Whitin, T. M., Analysis of Inventory Systems (Prentice-Hall, Inc., Englewood 

Cliffs, N. J., 1963). 
[3] Hanssmann, F., Operations Research in Production and Inventory Control (John Wiley & Sons, 

Inc., New York and London, 1962). 
[4] Karlin, S., A First Course in Stochastic Processes (Academic Press, New York and London, 1966). 



INDEX OF VOLUME 17 

AHSANULLAH, M. and A. K. MD. E. SALEH, "Optimum Allocation of Quantiles in Disjoint Intervals 

for the Blues of the Parameters of Exponential Distribution when the Sample is Censored in the 

Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349. 
ALTER, R. and B. LIENTZ, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation," 

Vol. 17, No. 3, Sept. 1970, pp. 407-408. 
BALAS, E., "Machine Sequencing: Disjunctive Graphs and Degree-Constrained Subgraphs," Vol. 17, 

No. 1, Mar. 1970, pp. 1-10. 
BEGED-DOV, A. G., "Contract Award Analysis by Mathematical Programming," Vol. 17, No. 3, Sept. 

1970, pp. 297-307. 
BELL, C. E., "Multiple Dispatches in a Poisson Process," Vol. 17, No. 1, Mar. 1970, pp. 99-102. 
BELLMORE, M., G. BENNINGTON and S. LUBORE, "A Network Isolation Algorithm," Vol. 17, 

No. 4, Dec. 1970, pp. 461-469. 
BENNINGTON, G., M. BELLMORE and S. LUBORE, "A Network Isolation Algorithm," Vol. 17,

No. 4, Dec. 1970, pp. 461-469.
BENNINGTON, G., and S. LUBORE, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec. 

1970, pp. 471-484. 
BHASHYAM, N., "Stochastic Duels with Lethal Dose," Vol. 17, No. 3, Sept. 1970, pp. 397-405. 
BHASHYAM, N., "Stochastic Duels with Nonrepayable Weapons," Vol. 17, No. 1, Mar. 1970, pp. 

121-129. 
BLUMENTHAL, S., "Interval Estimation of the Normal Mean Subject to Restrictions, When the Vari- 
ance Is Known," Vol. 17, No. 4, Dec. 1970, pp. 485-505. 
BOWMAN, V. J., JR. and G. L. NEMHAUSER, "A Finiteness Proof for Modified Dantzig Cuts in 

Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313. 
BRANDT, E. B. and D. R. LIMAYE, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4, 

Dec. 1970, pp. 525-534. 
BROWN, R. G. and G. GERSON, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept. 

1970, pp. 351-358. 
BURT, J. M., JR., D. P. GAVER, and M. PERLAS, "Simple Stochastic Networks: Some Problems and 

Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459. 
COZZOLINO, J. M., "The Optimal Burn-In Testing of Repairable Equipment," Vol. 17, No. 2, June 

1970, pp. 167-181. 
CREMEANS, J. E., R. A. SMITH, and G. R. TYNDALL, "Optimal Multicommodity Network Flows 

with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279. 
DAY, J. E. and M. P. HOTTENSTEIN, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970, 

pp. 11-39. 
DISNEY, R. L. and W. E. MITCHELL, "A Solution for Queues with Instantaneous Jockeying and Other 

Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325. 
DUDEWICZ, E. J., "Confidence Intervals for Ranked Means," Vol. 17, No. 1, Mar. 1970, pp. 69-78. 
EVANS, J. P., "On Constraint Qualifications in Nonlinear Programming," Vol. 17, No. 3, Sept. 1970,

pp. 281-286. 

561 




GAVER, D. P., J. M. BURT, JR., and M. PERLAS, "Simple Stochastic Networks: Some Problems and 

Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459. 
GERSON, G. and R. G. BROWN, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept. 

1970, pp. 351-358. 
GOODMAN, I. F., "Statistical Quality Control of Information," Vol. 17, No. 3, Sept. 1970, pp. 389-396. 
HABER, S. E. and R. SITGREAVES, "A Methodology for Estimating Expected Usage of Repair Parts 

with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
HARRIS, M. Y., "A Mutual Primal-Dual Linear Programming Algorithm," Vol. 17, No. 2, June 1970, 

pp. 199-206. 
HARTMAN, J. K. and L. S. LASDON, "A Generalized Upper Bounding Method for Doubly Coupled 

Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429. 
HITCHCOCK, D. F. and J. B. MACQUEEN, "On Computing the Expected Discounted Return in a 

Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241. 
HO, C., "A Note on the Calculation of Expected Time-Weighted Backorders Over A Given Interval,"

Vol. 17, No. 4, Dec. 1970, pp. 555-559. 
HOTTENSTEIN, M. P. and J. E. DAY, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970, 

pp. 11-39. 
JACQUETTE, S. C, "Suboptimal Ordering Policies Under the Full Cost Criterion," Vol. 17, No. 1, 

Mar. 1970, pp. 131-132. 
KALMAN, P. J., "A Stochastic Constrained Optimal Replacement Model" Vol. 17, No. 4, Dec. 1970, 

pp. 547-553. 
KAPLAN, A. J., "The Relationship Between Decision Variables and Penalty Cost Parameter in (Q, R) 

Inventory Models," Vol. 17, No. 2, June 1970, pp. 253-258. 
LASDON, L. S. and J. K. HARTMAN, "A Generalized Upper Bounding Method for Doubly Coupled 

Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429. 
LIENTZ, B. and R. ALTER, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation," 

Vol. 17, No. 3, Sept. 1970, pp. 407-408. 
LIMAYE, D. R. and E. B. BRANDT, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4, 

Dec. 1970, pp. 525-534. 
LUBORE, S., M. BELLMORE and G. BENNINGTON, "A Network Isolation Algorithm," Vol. 17, 

No. 4, Dec. 1970, pp. 461-469. 
LUBORE, S. and G. BENNINGTON, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec. 

1970, pp. 471-484. 

MCMASTERS, A. W. and T. M. MUSTIN, "Optimal Interdiction of a Supply Network," Vol. 17, No. 3, Sept. 1970, pp. 261-268.
MACQUEEN, J. B. and D. F. HITCHCOCK, "On Computing the Expected Discounted Return in a Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241.
MALIK, H. J., "The Distribution of the Product of Two Non-Central Beta Variates," Vol. 17, No. 3, Sept. 1970, pp. 327-330.
MANN, N. R., "Computer-Aided Selection of Prior Distributions for Generating Monte Carlo Confidence Bounds on System Reliability," Vol. 17, No. 1, Mar. 1970, pp. 41-54.
MARKLAND, R. E., "A Comparative Study of Demand Forecasting Techniques for Military Helicopter Spare Parts," Vol. 17, No. 1, Mar. 1970, pp. 103-119.
MAZUMDAR, M., "Some Estimates of Reliability Using Interference Theory," Vol. 17, No. 2, June 1970, pp. 159-165.
MITCHELL, W. E. and R. L. DISNEY, "A Solution for Queues with Instantaneous Jockeying and Other Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325.
MOGLEWER, S. and C. PAYNE, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1, Mar. 1970, pp. 87-97.
MOREY, R. C., "Inventory Systems with Imperfect Demand Information," Vol. 17, No. 3, Sept. 1970, pp. 287-295.
MUSTIN, T. M. and A. W. MCMASTERS, "Optimal Interdiction of a Supply Network," Vol. 17, No. 3, Sept. 1970, pp. 261-268.
NEMHAUSER, G. L. and V. J. BOWMAN, JR., "A Finiteness Proof for Modified Dantzig Cuts in Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313.
NIGHTENGALE, M. E., "The Value Statement," Vol. 17, No. 4, Dec. 1970, pp. 507-514.
NOLAN, R. L., "Systems Analysis and Planning-Programming-Budgeting Systems (PPBS) for Defense Decision Making," Vol. 17, No. 3, Sept. 1970, pp. 359-372.
PAYNE, C. and S. MOGLEWER, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1, Mar. 1970, pp. 87-97.
PERLAS, M., J. M. BURT, JR., and D. P. GAVER, "Simple Stochastic Networks: Some Problems and Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
PRESUTTI, V. J., JR., and R. C. TREPP, "More Ado About Economic Order Quantities (EOQ)," Vol. 17, No. 2, June 1970, pp. 243-251.
ROLFE, A. J., "Markov Chain Analysis of a Situation Where Cannibalization is the Only Repair Activity," Vol. 17, No. 2, June 1970, pp. 151-158.
SALEH, A. K. MD. E. and M. AHSANULLAH, "Optimum Allocation of Quantiles in Disjoint Intervals for the BLUEs of the Parameters of Exponential Distribution when the Sample is Censored in the Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349.
SCHAFER, R. E. and N. D. SINGPURWALLA, "A Sequential Bayes Procedure for Reliability Demonstration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SCHRADY, D. A., "Operational Definitions of Inventory Record Accuracy," Vol. 17, No. 1, Mar. 1970, pp. 133-142.
SCOTT, M., "A Queueing Process with Varying Degree of Service," Vol. 17, No. 4, Dec. 1970, pp. 515-523.
SINGPURWALLA, N. D. and R. E. SCHAFER, "A Sequential Bayes Procedure for Reliability Demonstration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SITGREAVES, R. and S. E. HABER, "A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
SMITH, R. A., J. E. CREMEANS, and G. R. TYNDALL, "Optimal Multicommodity Network Flows with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
SPIVEY, W. A. and H. TAMURA, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970, pp. 183-192.
STEINBERG, D. I., "The Fixed Charge Problem," Vol. 17, No. 2, June 1970, pp. 217-235.
STERNLIGHT, D., "The Fast Deployment Logistic Ship Project: Economic Design and Decision Techniques," Vol. 17, No. 3, Sept. 1970, pp. 373-387.
TAMURA, H. and W. A. SPIVEY, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970, pp. 183-192.
TREPP, R. C. and V. J. PRESUTTI, JR., "More Ado About Economic Order Quantities (EOQ)," Vol. 17, No. 2, June 1970, pp. 243-251.
TYNDALL, G. R., J. E. CREMEANS, and R. A. SMITH, "Optimal Multicommodity Network Flows with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
VON LANZENAUER, C. H., "Production and Employment Scheduling in Multistage Production Systems," Vol. 17, No. 2, June 1970, pp. 193-198.
WOLLMER, R. D., "Interception in a Network," Vol. 17, No. 2, June 1970, pp. 207-216.
YASUDA, Y., "A Note on the Core of a Cooperative Game without Side Payment," Vol. 17, No. 1, Mar. 1970, pp. 143-149.
ZACKS, S., "A Two-Echelon Multi-Station Inventory Model for Navy Applications," Vol. 17, No. 1, Mar. 1970, pp. 79-85.
ZWART, P. B., "Nonlinear Programming—The Choice of Direction by Gradient Projection," Vol. 17, No. 4, Dec. 1970, pp. 431-438.

U.S. GOVERNMENT PRINTING OFFICE: 1971 433-702/3



INFORMATION FOR CONTRIBUTORS 

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of 
scientific information in logistics and will publish research and expository papers, including those 
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve 
the efficiency and effectiveness of logistics operations. 

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL 
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. 
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees. 

Manuscripts submitted for publication should be typewritten, double-spaced, and the author 
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted 
with the original. 

A short abstract (not over 400 words) should accompany each manuscript. This will appear 
at the head of the published paper in the QUARTERLY. 

There is no authorization for compensation to authors for papers which have been accepted 
for publication. Authors will receive 250 reprints of their published papers. 

Readers are invited to submit to the Managing Editor items of general interest in the field 
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections 
of the QUARTERLY. 



NAVAL RESEARCH LOGISTICS QUARTERLY
DECEMBER 1970, VOL. 17, NO. 4
NAVSO P-1278

CONTENTS 

ARTICLES                                                               Page

A Generalized Upper Bounding Method for Doubly Coupled Linear Programs by J. K. Hartman and L. S. Lasdon .... 411
Nonlinear Programming—The Choice of Direction by Gradient Projection by P. B. Zwart .... 431
Simple Stochastic Networks: Some Problems and Procedures by J. M. Burt, Jr., D. P. Gaver and M. Perlas .... 439
A Network Isolation Algorithm by M. Bellmore, G. Bennington and S. Lubore .... 461
Resource Allocation for Transportation by G. Bennington and S. Lubore .... 471
Interval Estimation of the Normal Mean Subject to Restrictions, when the Variance Is Known by S. Blumenthal .... 485
The Value Statement by M. E. Nightengale .... 507
A Queueing Process with Varying Degree of Service by M. Scott .... 515
MAD: Mathematical Analysis of Downtime by E. B. Brandt and D. R. Limaye .... 525
A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History by S. E. Haber and R. Sitgreaves .... 535
A Stochastic Constrained Optimal Replacement Model by P. J. Kalman .... 547
A Note on the Calculation of Expected Time-Weighted Backorders over a Given Interval by C. Ho .... 555



OFFICE OF NAVAL RESEARCH 
Arlington, Va. 22217